TITLE PAGE

I declare that this report entitled “DEVELOPMENT OF PERSON IDENTIFICATION APPLICATION FOR VIDEO SURVEILLANCE” is my own work, except as cited in the references. I thank Ng Hui Fuang, who gave me this opportunity to design a person identification application for video surveillance.

LIST OF TABLES

LIST OF ABBREVIATIONS

INTRODUCTION

Problem Statement and Motivation
Project Scope
Historical Development

Report Organisation

LITERATURE REVIEW

Therefore, it should work together with the extraction of other features to increase the effectiveness of person identification for video surveillance. Chapter 3 presents the flowchart and use case diagrams for the person identification application, each with a description.

Figure 1-5-1: Person identification from camera. [6]

LITERATURE REVIEW

Large Intra-class Variation

According to [7], since there was no proper research based on the feature matching scheme, researchers proposed an enhanced Bag of Features (BOF) based on the Speeded Up Robust Feature (SURF) algorithm for person re-identification. Table 2-12-1 summarises the challenges faced by the researchers and the solutions proposed to address them.

Figure 2-11-1: Positive Ranking List and Negative Ranking List [16]

SYSTEM METHODOLOGY/APPROACH

System Flowchart for the Application
Use Case Diagram for Application

Use Case Description for Uploading Images
Use Case Description for Uploading Videos
Use Case Description for Watching Person Identification Videos

Use Case ID UC03

Use Case Description for Downloading Person Identification Videos
Activity Diagram for Application
Timeline

FYP1 Timeline
FYP2 Timeline
SYSTEM DESIGN

Trigger: User presses the 'Show person identification videos' button to view the person identification videos at the same time.
Trigger: User presses 'Download Video 1' hyperlink or 'Download Video 2' hyperlink or 'Download Video 3' hyperlink or .

Figure 3-2-1: Use Case Diagram for Application

SYSTEM DESIGN

System Design Concepts

Person Detection: YOLOv3
Person Tracking: DeepSORT
Person Identification: Self-Implemented Model

Reason of Implementation of Self-Implemented Model
Data Augmentation
Convolutional Layers
Max Pooling
ReLU Activation
Dropout
Flatten

In keeping with the name YOLO, the prediction layer uses 1 x 1 convolutions, as shown in Figure 4-1-2, and the size of the prediction map is the same as the size of the feature map before it. Each bounding box is described by the x-coordinate (x) and y-coordinate (y) of its upper left corner, its width (w), its height (h), and the confidence score of the box. The Manhattan distance metric, which was considered in the nineteenth century, is the sum of the absolute differences between the components of the two vectors, with no square root taken, as shown in Figure 4-1-7.

On the other hand, the Euclidean distance is the square root of the sum of the squared differences between the vectors, as shown in Figure 4-1-8. The cosine distance is then obtained by subtracting the cosine similarity from one [24], as shown in Figure 4-1-9. After that, the images of the person to be identified were assigned to the query set.
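The three distance metrics can be illustrated with a minimal sketch in plain Python. The function names are chosen here for illustration and are not taken from the report's code; the inputs are assumed to be equal-length, non-zero feature vectors.

```python
import math


def manhattan_distance(a, b):
    """Sum of the absolute component differences (no square root)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Square root of the sum of the squared component differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine similarity of the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

For the vectors (1, 2) and (4, 6), the Manhattan distance is 3 + 4 = 7 and the Euclidean distance is the square root of 9 + 16, which is 5.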

As good practice, the images of the person to be identified should come from different cameras and show different angles. Because dropout is used, batch normalization is not performed, since batch normalization does not perform well in the presence of dropout.

Figure 4-1-2: Architecture of YOLOv3 [21]

Dense Layer

Adam Optimization
Sparse Categorical Cross Entropy
Softmax
Summary of the Model
System Design Procedures

Person Identification Library
Building Own Model for Person Identification

After that, the model was ready to be compiled with the Adam optimizer and the sparse categorical cross-entropy loss. Exploding gradients occur when the model has large parameter updates, caused by large gradient accumulation. This makes the model unable to learn effectively from the training data.
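As an illustration of the loss named above, the following sketch computes softmax probabilities and the sparse categorical cross-entropy for a single sample in plain Python. It is not the report's training code; the function names are illustrative, and the label is assumed to be an integer class index, which is what "sparse" refers to.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])
```

With three equal scores, each class receives probability 1/3 and the loss equals ln 3, whichever class is the true one.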

In Figure 4-1-18, the model suffers from performance degradation, especially in the bottom layers, as the gradient updates in the bottom layers are minuscule. On the other hand, the model shown in Figure 4-1-19 suffers performance degradation as the . Before choosing the model with 6 convolutional layers, the model with 4 convolutional layers and a dense layer of 256 dimensions and the model with 5 convolutional layers and a dense layer of 512 dimensions were trained.

However, the CNN model consisting of 6 convolutional layers with a dense layer of 1024 dimensions performed best, with no overfitting or underfitting problems. Another reason for choosing the model with 6 convolutional layers was that the training dataset grows as new videos and images of the target are uploaded to this application.

Model with 6

Target Prediction
Person Detection

After flattening, the dense layer had a dimension of 1024, and the output size was consistent with the number of classes in the data library. If the person in the eight images was predicted as 'None', the images were passed to person tracking, only for image retrieval from one of the videos, and the model was retrained on the new dataset in the Training Library. On the other hand, if the person in the eight images was not predicted as 'None', the application proceeded directly to person identification without performing person tracking first.

The final PID was determined by selecting the PID with the highest frequency in 'target_list'. Person tracking was implemented to retrieve further bounding boxes of the person for training the person re-identification model. YOLOv3 can detect a total of 80 classes as listed in 'coco.names', such as person, bicycle, car, motorcycle and airplane.
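The selection of the final PID from 'target_list' amounts to a majority vote, which can be sketched as follows. This is a minimal illustration and not the report's code: the function name is hypothetical, the list is assumed to hold one predicted label per query image, and ties are assumed to go to the label seen first.

```python
from collections import Counter


def final_pid(target_list):
    """Return the label that occurs most often in target_list."""
    if not target_list:
        raise ValueError("target_list is empty")
    pid, _count = Counter(target_list).most_common(1)[0]
    return pid
```

Under this sketch the label 'None' is counted like any other label, so the vote itself can return 'None' when most images are not recognised.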

In video processing, the class list in 'coco.names' was needed for person detection. Therefore, the class list in 'coco.names' was read and loaded into an array, 'class_names'.
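Loading the class list can be sketched as below, assuming 'coco.names' is a plain text file with one class name per line. The function name is illustrative and the file location is an assumption.

```python
def load_class_names(path):
    """Return the non-empty lines of the file as a list of class names."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Example (assumes 'coco.names' is in the working directory):
# class_names = load_class_names("coco.names")
```

Blank lines are skipped so that a trailing newline in the file does not produce an empty class name.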

Person Tracking

Video Dataset (Person Tracking)
Model Training
Person Identification

Initial Person Identification
Predict None
Model Retraining

Remove Directory
User Interface Design

The images of the targeted person to be identified were assigned to the input query set. Therefore, during person identification, sets of images of the person were cropped from the video and stored in the folders, and the model was retrained. Once 'classname' was 'person', the coordinates of the bounding box were assigned to four variables, xA, xB, yA and yB, as shown in Figure 4-2-11.
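One way to derive the four variables from a detected box is sketched below. It assumes, as is not confirmed by the report, that (xA, yA) is the upper left corner and (xB, yB) the lower right corner, that the box is given as upper-left x, y plus width and height, and that the coordinates are clipped to the frame size. The function name is hypothetical.

```python
def box_to_corners(x, y, w, h, frame_width, frame_height):
    """Convert an (x, y, w, h) box to corner coordinates inside the frame."""
    xA = max(0, int(x))
    yA = max(0, int(y))
    xB = min(frame_width, int(x + w))
    yB = min(frame_height, int(y + h))
    return xA, yA, xB, yB


# With a frame stored as a NumPy-style array, the person could then be
# cropped as frame[yA:yB, xA:xB] (rows first, then columns).
```

Clipping keeps the crop valid when a detection extends slightly past the edge of the frame.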

The class name was the label of the PID directory in the Training directory, as shown in Figure 4-2-15. The image was sharpened because the person in the video could be blurred, as the resolution of a CCTV camera is not as high as that of a digital camera. First, the temporary training directory (tmpTraining) and the PID were joined into a path to check whether it exists.
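The path check described above can be sketched with the standard library. Creating the directory when it is missing, and the removal helper, are assumptions about what follows the check; the function names are illustrative.

```python
import os
import shutil


def ensure_pid_dir(tmp_training_dir, pid):
    """Join the temporary training directory and the PID; create it if absent."""
    path = os.path.join(tmp_training_dir, str(pid))
    if not os.path.exists(path):
        os.makedirs(path)
    return path


def remove_dir(path):
    """Delete a directory tree if it exists (assumed clean-up step)."""
    if os.path.isdir(path):
        shutil.rmtree(path)
```

For example, ensure_pid_dir("tmpTraining", 3) would return the joined path for PID 3 and create it on first use.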

The input of this application was the images of the targeted person and the camera videos. The output was the targeted person shown across multiple videos for users.

SYSTEM IMPLEMENTATION

Methodologies and General Work Procedures
Tools to Use

Hardware Setup
Software Setup

User Requirement
System Performance
Verification Plan
System Operation (with Screenshot)
SYSTEM EVALUATION AND DISCUSSION

The hardware requirements for the person identification application for video surveillance are detailed in Table 5-2-1. Therefore, it is the most suitable language to implement the person identification application for video surveillance. The input was the images of the targeted person and the surveillance videos, while the output was the person identification result, with the target marked by green bounding boxes.

In this person identification model, the performance of the CNN model was evaluated based on its percentage accuracy. There were three main processes in this application: person detection, person tracking and person identification. Person identification was performed using a CNN model with the training dataset and the query dataset.
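Percentage accuracy can be computed as the share of correctly predicted labels, as in this minimal sketch. The function name and the label values are illustrative and are not taken from the report.

```python
def accuracy_percent(predicted, actual):
    """Percentage of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no samples to evaluate")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

Three correct predictions out of four give an accuracy of 75 percent.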

If the user wants to watch the 4 simultaneous videos again, the user clicks 'Show person identification videos', as in Figure 5-6-11. The user watches the downloaded person identification videos one by one in the video player, as shown in Figure 5-6-13.

SYSTEM EVALUATION AND DISCUSSION

System Testing
Figure 6-1-5: Frames in the Video Identification 1
Use Case Testing

Objectives Evaluation
Project Challenges

CONCLUSION AND RECOMMENDATION

In the upper left and upper right parts of Figure 6-1-5, a person appeared alone in a corner. In the lower left and lower right of Figure 6-1-5, however, a person appeared close to others. In the upper right, person occlusion would almost occur if the people were closer to each other.

When the person ID and target ID were the same, the person in the video was enclosed in a green bounding box. When a person was identified as the target, the ID assigned by the tracker was flagged, and whenever the tracker detected that ID, a green bounding box was drawn around the person with that ID in the video. After identifying the person in the videos, the person identification videos were displayed simultaneously as output.

So when a person was predicted as None, the person's images were stored in the PID directory of the tmpTraining directory. Therefore, the detector could miss the person, as shown by the red circles in Figure 6-4-3.

CONCLUSION AND RECOMMENDATION

Project Review and Discussion
Novelties and Contributions
Conclusion with Supportive Remarks
Future Work

The person identification application tracks the target in multiple videos in a public or private environment. In short, a person may be tracked as another person even though both are the same person, because the same person is assigned a different Person ID in each person identification video.

In this project, the application solved this problem through person identification, using the CNN model and the list prediction method. The person identification application for video surveillance reduces the manpower needed for video surveillance, reduces human error and improves the accuracy of person identification in video. The green bounding box indicates that the person in the bounding box is the target.

First, the person identification process took a long time to complete, as the DeepSORT tracker may need a better GPU to run faster. Given enough time, background removal could increase the accuracy of the person identification application.

Figure 7-3-1: Person identification in Video Surveillance Application

Le, “Robust person re-identification through combination of metric learning and late fusion techniques”.
Wang, “Improving semantic part features for supervised non-local similarity person re-identification,” Tsinghua Sci.
Sheng et al., “Globally and Efficiently Robust Pattern Mining for Person Re-identification,” IEEE Internet Things J., vol.
Peng, “Toward fast and kernelized orthogonal discriminant analysis of person re-identification,” Pattern Recognit., vol.
Lin, “Deep group shuffling dual random walks with label smoothing for person re-identification,” IEEE Access, vol.
Paulus, “Simple online and realtime tracking with a deep association metric,” in 2017 IEEE International Conference on Image Processing (ICIP), 2017.
[30] “Optimization with ADAM and RMSprop in a convolutional neural network (CNN): A case study for Telugu handwritten characters,” Int.
Zhu, “Class learning based on CNN feature extraction with optimized softmax and one-class classifiers,” IEEE Access, vol.

APPENDIX

FINAL YEAR PROJECT WEEKLY REPORT

WORK DONE
WORK TO BE DONE
PROBLEMS ENCOUNTERED
SELF EVALUATION OF THE PROGRESS
SELF EVALUATION OF THE PROGRESS: Need to increase the speed of implementation
PROBLEMS ENCOUNTERED: The DeepSORT tracker is very slow
PROBLEMS ENCOUNTERED
PROBLEMS ENCOUNTERED: Have not started writing the report
WORK DONE: Draft FYP2 report
SELF EVALUATION OF THE PROGRESS: Be more effective in writing the report
WORK DONE: FYP2 Report in progress
WORK DONE
WORK TO BE DONE: Final check on FYP2 Report
PROBLEMS ENCOUNTERED: -
SELF EVALUATION OF THE PROGRESS: Submit the report on time

Do not retrain the model when new videos are added; use the transfer learning method instead. Connect Visual Studio Code (frontend) and Jupyter Notebook (backend). Try to figure out how to remove the background if possible. Need to proceed faster and solve more complex internal problems such as the bounding box and UI design.

The videos UI is a bit too complicated to complete, as it exceeds 50MB.

PLAGIARISM CHECK RESULT

CHECKLIST

UNIVERSITI TUNKU ABDUL RAHMAN
FACULTY OF INFORMATION & COMMUNICATION TECHNOLOGY (KAMPAR CAMPUS)