
4.5.4 Prediction using the Model

After training, the model is loaded into the program before the acquisition of video input begins and is used to predict on each video frame after the grayscale conversion, face detection, and face cropping described in Chapter 4.4.5. After cropping, the faces are resized to the height and width specified during the training and validation data generation process. Once resized, a batch dimension is added to the resulting numpy.ndarray (via numpy.expand_dims) so that it matches the input shape the trained model expects.

Because the trained model takes an array-type image for prediction, no additional data-type conversion is required: OpenCV already reads each video frame as a numpy.ndarray.
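As a quick illustration of this point, the following minimal sketch (the video file name is an illustrative assumption) shows that OpenCV returns each frame as a numpy.ndarray:

import cv2

# Illustrative sketch: OpenCV frames are already numpy.ndarray objects,
# so no extra type conversion is needed before prediction.
cap = cv2.VideoCapture('input_video.mp4')  # assumed video file name
ret, frame = cap.read()
if ret:
    print(type(frame))   # <class 'numpy.ndarray'>
    print(frame.shape)   # e.g. (720, 1280, 3), BGR channel order
cap.release()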


After the face is cropped and converted into the data format accepted by the trained model, the model predicts on the cropped and processed face and returns an array of eight floating-point numbers indicating the predicted confidence level for each of the eight facial expression classes. The indices of the returned array correspond to the expression classes, and the argmax function returns the index with the highest confidence value. That index is then converted into a word by looking it up in a string array of class names, obtained via the train data generator's class_indices.keys().
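For reference, the hard-coded label list shown below can also be derived directly from the training data generator; the following is a minimal sketch, in which the dataset path and image size are illustrative assumptions:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_HEIGHT = IMG_WIDTH = 224  # assumed training image size

# Build the training generator as in the earlier training sections
train_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
    'base_dataset/train',                  # illustrative dataset path
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    class_mode='categorical')

# class_indices maps class name -> output index; sorting the keys by
# their index gives the labels in the order of the prediction array.
emotion_label = sorted(train_generator.class_indices,
                       key=train_generator.class_indices.get)
print(emotion_label)  # e.g. ['ANGRY', 'CALM', ..., 'SURPRISED']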

Figure 4.5.4.1 The indices keys of trained model prediction array

# Imports required by this snippet
import cv2
import numpy as np
import tensorflow as tf

# Before reading the video input: load the trained model and the label list
classifier = tf.keras.models.load_model(r'Saved_Model\ResNet50_FER.h5')
emotion_label = ['ANGRY', 'CALM', 'CONFUSED', 'DISGUSTED',
                 'FEAR', 'HAPPY', 'SAD', 'SURPRISED']

# While reading the video input, when the frame count is a multiple of FPS
if cur_frame % FPS == 0:
    # Clear the bounding boxes recorded for the previous processed frame
    box.drop(box.index, inplace=True)
    gray = cv2.cvtColor(resize, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        tempdict = {'x': [], 'y': [], 'w': [], 'h': [], 'Emotion': []}
        tempdict['x'].append(x)
        tempdict['y'].append(y)
        tempdict['w'].append(w)
        tempdict['h'].append(h)
        cropped = resize[y:y + h, x:x + w]
        cv2.rectangle(resize, (x, y), (x + w, y + h), (0, 0, 255), 5)
        resized = cv2.resize(cropped, (IMG_HEIGHT, IMG_WIDTH))
        # Add a batch dimension: (H, W, C) -> (1, H, W, C)
        a = np.expand_dims(resized, axis=0)
        array = classifier.predict(a, batch_size=1, verbose=1)
        answer = np.argmax(array, axis=1)
        emotion = emotion_label[answer[0]]  # predicted expression name


4.5.5 Teacher-Student Machine Learning

To train the model using teacher-student machine learning, whenever the trained model predicts an expression with a confidence level below 0.8, the cropped face is sent to AWS Rekognition and stored in the base dataset directory according to the AWS Rekognition result. When the confidence level is 0.8 or greater, the predicted outcome is displayed directly on the video frame. After the video finishes, the train_model() function is called again to train a new model, starting with a split of the base dataset directory that now includes the newly added data. As the number of requests sent to AWS Rekognition gradually grows, the base dataset grows as well, providing additional AWS Rekognition-verified data for training the new model and resulting in a more accurate and reliable FER model.

from datetime import datetime

# weimun19_1utar_rekognition is the AWS Rekognition client created earlier
if array[0][answer[0]] < 0.8:
    # Low confidence: ask AWS Rekognition (the teacher) to label the face
    ret, buf = cv2.imencode('.jpg', cropped)
    rekognition_response = weimun19_1utar_rekognition.detect_faces(
        Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])
    # print(rekognition_response)
    results = rekognition_response.get('FaceDetails')
    emotion = 'Call AWS : Not Detected'
    for result in results:
        face_emotion_confidence = 0
        face_emotion = None
        # Keep the emotion with the highest Rekognition confidence
        for emotion_result in result.get('Emotions'):
            if emotion_result.get('Confidence') >= face_emotion_confidence:
                face_emotion_confidence = emotion_result['Confidence']
                face_emotion = emotion_result.get('Type')
        print(face_emotion)
        emotion = 'Call AWS : ' + face_emotion
        # Store the teacher-labelled face into the base dataset directory
        cv2.imwrite('same_person/' + face_emotion + '/'
                    + str(datetime.now().strftime('%m%d%H%M%S')) + '.jpg',
                    cropped)
    cv2.putText(img=resize, text=emotion, org=(x + 15, y - 15),
                fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.8,
                color=(0, 255, 0), thickness=1)
else:
    # High confidence: display the student model's own prediction
    emotion = 'Model: ' + emotion_label[answer[0]]
    cv2.putText(img=resize, text=emotion, org=(x + 15, y - 15),
                fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.8,
                color=(0, 255, 0), thickness=1)
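The retraining step that runs after the video ends is not shown above; the following is a minimal sketch, assuming train_model() is the training routine described earlier and that cap is the cv2.VideoCapture handle used to read the video input:

# Illustrative sketch: once the video input is exhausted, retrain the
# student model on the base dataset enlarged with the faces that were
# labelled and stored by AWS Rekognition above.
cap.release()            # assumed cv2.VideoCapture handle
cv2.destroyAllWindows()

# train_model() is assumed to re-split the base dataset directory into
# training and validation sets and to train and save a new FER model.
train_model()

# Reload the newly trained model for the next prediction session
classifier = tf.keras.models.load_model(r'Saved_Model\ResNet50_FER.h5')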


4.6 Summary

The straightforward FER implementation and two potential optimizations, downsampling and spatial trimming, are implemented primarily in an Android application to make the demonstration more practical and to show their effect. The potential optimization of caching methodologies, in contrast, is carried out in a Jupyter Notebook to make it easier to evaluate the methods' viability. The following chapter will evaluate the optimization potential of grayscale conversion, face detection, face cropping, comparison against cached expressions, and adoption of a teacher-student machine learning model, as well as finalize the implementation from acquiring the input, through processing it and recognizing expressions, to displaying the output.


Chapter 5
Experiments

5.0 Overview

This chapter will discuss the evaluation of the proposed preprocessing steps' processing time, elapsed time, and the bandwidth required to call AWS Rekognition with various image types. Additionally, it will discuss the total processing time required to obtain the AWS Rekognition result with and without the preprocessing steps, as well as the accuracy associated with using the various image types.

The proposed caching methodologies are also evaluated, as is the accuracy of the model trained with the teacher-student machine learning approach and the caching methodologies' effect on the number of AWS Rekognition requests.

This chapter is divided into two sections: the experimental design and the experimental results. The experimental design section will demonstrate how the evaluation is carried out in accordance with the project's objectives, while the experimental results section will compare and analyze the results obtained through the series of evaluations.

Finally, considering all the evaluation results, the final FER implementation will be determined and explained in the summary of this chapter.
