TITLE PAGE

Nguyễn Gia Hào

Academic year: 2023

I declare that this report entitled “DEVELOPMENT OF PERSON IDENTIFICATION APPLICATION FOR VIDEO SURVEILLANCE” is my own work, except as cited in the references.

I would like to thank Ng Hui Fuang, who gave me this opportunity to design a person identification application for video surveillance.

LIST OF TABLES

LIST OF ABBREVIATIONS

INTRODUCTION

  • Problem Statement and Motivation
  • 1.2 Project Scope
  • 1.5 Historical Development
    • Report Organisation
  • LITERATURE REVIEW

Therefore, it should work together with the extraction of other features to increase the effectiveness of person identification for video surveillance. Chapter 3 presents the flowchart and use case diagrams for the person identification application, each with a description.

Figure 1-5-1: Person identification from camera. [6]

LITERATURE REVIEW

2.7 Large Intra-class Variation

According to [7], since there had been no proper research based on the feature matching scheme, researchers proposed an enhanced Bag of Features (BOF) based on the Speeded Up Robust Feature (SURF) algorithm for person re-identification. Table 2-12-1 summarises the challenges faced by the researchers and the solutions they proposed.

Figure 2-11-1: Positive Ranking List and Negative Ranking List [16]

SYSTEM METHODOLOGY/APPROACH

  • System Flowchart for the Application
  • Use Case Diagram for Application
    • Use Case Description for Uploading Images
    • Use Case Description for Uploading Videos
    • Use Case Description for Watching Person Identification Videos
  • Use Case ID UC03
    • Use Case Description for Downloading Person Identification Videos
    • Activity Diagram for Application
    • Timeline
  • 3.4.2 FYP1 Timeline
  • 3.4.3 FYP2 Timeline
  • SYSTEM DESIGN

Trigger: User presses the 'Show person identification videos' button to view the person identification videos at the same time.
Trigger: User presses the 'Download Video 1' hyperlink or 'Download Video 2' hyperlink or 'Download Video 3' hyperlink or ...

Figure 3-2-1: Use Case Diagram for Application

SYSTEM DESIGN

System Design Concepts

  • Person Detection: YOLOv3
  • Person Tracking: DeepSORT
  • Person Identification: Self-Implemented Model
    • Reason of Implementation of Self-Implemented Model
    • Data Augmentation
    • Convolutional Layers
    • Max Pooling
    • ReLU Activation
    • Dropout
    • Flatten

In keeping with the name YOLO, the prediction layer uses 1 x 1 convolutions, as shown in Figure 4-1-2, and the prediction map is the same size as the feature map before it. Each bounding box has an x-coordinate (x) and a y-coordinate (y) for its upper left corner, a width (w), a height (h), and a confidence score. The Manhattan distance metric, which dates from the nineteenth century, is the sum of the absolute differences between the vectors, with no square root taken, as shown in Figure 4-1-7.
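As a rough illustration of the prediction layer, the depth of a prediction map can be derived from the box attributes listed above. This is only a sketch: it assumes the standard YOLOv3 setting of three boxes per grid cell and the 80 COCO classes, and the function names are hypothetical.

```python
def prediction_depth(num_classes: int, boxes_per_cell: int = 3) -> int:
    """Number of channels produced by the 1 x 1 prediction convolution.

    Each box carries x, y, w, h and a confidence score (5 values)
    plus one score per class. boxes_per_cell=3 is an assumption
    based on the standard YOLOv3 configuration.
    """
    box_attributes = 5  # x, y, w, h, confidence
    return boxes_per_cell * (box_attributes + num_classes)


def prediction_shape(grid_size: int, num_classes: int = 80) -> tuple:
    """Shape of one prediction map for a square feature map.

    The prediction map keeps the spatial size of the feature map before it;
    only the depth changes.
    """
    return (grid_size, grid_size, prediction_depth(num_classes))
```

With 80 classes, each grid cell therefore carries 3 x (5 + 80) = 255 values.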

On the other hand, the Euclidean distance is the square root of the sum of the squared differences between the vectors, as shown in Figure 4-1-8. The cosine distance is then obtained by subtracting the cosine similarity from one [24], as shown in Figure 4-1-9. After that, the images of the person to be identified were placed in the query set.
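The three distance metrics discussed here can be written out in a few lines. The sketch below uses plain Python lists as feature vectors; the function names are illustrative and are not taken from the application's code.

```python
import math


def manhattan_distance(a, b):
    # Sum of the absolute differences between the vectors.
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    # Square root of the sum of the squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    # One minus the cosine similarity of the two vectors.
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

For the vectors [1, 2, 3] and [4, 6, 3], the Manhattan distance is 3 + 4 + 0 = 7 and the Euclidean distance is the square root of 9 + 16 + 0, which is 5.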

As good practice, the images of the person to be identified should come from different cameras and show different angles. Because dropout is used, batch normalization is not applied, since batch normalization does not perform well in combination with dropout.

Figure 4-1-2: Architecture of YOLOv3 [21]

4.1.3.8 Dense Layer

  • Adam Optimization
  • Sparse Categorical Cross Entropy
  • Softmax
  • Summary of the Model
  • System Design Procedures
    • Person Identification Library
    • Building Own Model for Person Identification

After that, the model was ready to be compiled with the Adam optimizer and sparse categorical cross-entropy. Exploding gradients occur when the model has large parameter updates, caused by large gradient accumulation. This makes the model unable to learn effectively from the training data.
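To make the loss used at compile time more concrete, the following sketch shows what softmax and sparse categorical cross-entropy compute for a single sample. It is written in plain Python for illustration only and is not the framework implementation used in the project; the function names are hypothetical.

```python
import math


def softmax(logits):
    # Subtract the maximum logit for numerical stability before exponentiating.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probabilities, true_class):
    # "Sparse" means the label is an integer class index, not a one-hot vector.
    # The loss is the negative log-probability assigned to the true class.
    return -math.log(probabilities[true_class])
```

The loss is small when the model puts most of the probability on the correct class index and grows as that probability approaches zero.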

In Figure 4-1-18, the model suffers from performance degradation, especially in the bottom layers, as the gradient updates in those layers are minuscule. On the other hand, the model shown in Figure 4-1-19 suffers performance degradation as the ... Before choosing the model with 6 convolutional layers, two other models were trained: one with 4 convolutional layers and a dense layer of 256 dimensions, and one with 5 convolutional layers and a dense layer of 512 dimensions.

However, the CNN model with 6 convolutional layers and a dense layer of 1024 dimensions performed best, with no overfitting or underfitting problems. Another reason for choosing the model with 6 convolutional layers was that the amount of training data increases as new videos and images of the target are uploaded to the application.

Figure 4-1-17: Summary of Model

Model with 6

  • Target Prediction
  • Person Detection

After flattening, the dense layer had a dimension of 1024, and the output size matched the number of classes in the data library. If the person in the eight images was predicted as 'None', the images were passed to person tracking, solely to retrieve images from one of the videos, and the model was retrained on the new dataset in the Training Library. On the other hand, if the person in the eight images was not predicted as 'None', the application proceeded directly to person identification without going through person tracking first.

The final PID was determined by selecting the PID with the highest frequency in 'target_list'. Person tracking is implemented to further retrieve the bounding box of the person to be trained for person re-identification. The detector can detect a total of 80 classes as listed in 'coco.names', such as person, bicycle, car, motorcycle, airplane and more.
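The selection of the final PID from 'target_list' amounts to a majority vote. A minimal sketch is given below, assuming 'target_list' holds one predicted PID per query image as a string and that 'None' is itself one of the possible labels; the function names and the tie-breaking rule are assumptions, not details from the report.

```python
from collections import Counter


def final_pid(target_list):
    """Return the PID that occurs most often in target_list.

    Ties are resolved in favour of the PID seen first (an assumption;
    the report does not say how ties are handled).
    """
    if not target_list:
        raise ValueError("target_list is empty")
    pid, _count = Counter(target_list).most_common(1)[0]
    return pid


def needs_retraining(target_list):
    # Hypothetical helper: the target is unknown to the model
    # when the majority prediction is the label 'None'.
    return final_pid(target_list) == "None"
```

For example, if six of eight query images are predicted as PID "3" and two as "None", the final PID is "3" and no retraining is triggered.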

In video processing, the class list in 'coco.names' was needed for person detection. Therefore, the class list in 'coco.names' was read and loaded into an array, "class_names".
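Loading the class list can be sketched as follows, assuming 'coco.names' is a plain text file with one class name per line. The function name is hypothetical and the file location is an assumption.

```python
def load_class_names(path):
    """Read one class name per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as names_file:
        return [line.strip() for line in names_file if line.strip()]
```

A call such as `class_names = load_class_names("coco.names")` would then give a list whose index matches the class index returned by the detector.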

4.2.5 Person Tracking

  • Video Dataset (Person Tracking)
  • Model Training
  • Person Identification
    • Initial Person Identification
    • Predict None
    • Model Retraining
  • Remove Directory
  • User Interface Design

The images of the targeted person to be identified were assigned to the input query set. During person identification, sets of images of the person were cropped from the video and stored in the folders, and the model was retrained. Once 'classname' was person, the corner coordinates of the bounding box were stored in four variables, xA, xB, yA and yB, as shown in Figure 4-2-11.
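One way to obtain the four variables is to convert a box given as upper left corner plus width and height into corner coordinates. The sketch below assumes that (xA, yA) is the upper left corner and (xB, yB) the lower right corner, and it clamps the values to the frame so the crop stays inside the image; both the function name and the clamping step are assumptions.

```python
def box_corners(x, y, w, h, frame_width, frame_height):
    """Convert (x, y, w, h) to clamped corner coordinates (xA, yA, xB, yB)."""
    xA = max(0, int(x))                  # left edge
    yA = max(0, int(y))                  # top edge
    xB = min(frame_width, int(x + w))    # right edge
    yB = min(frame_height, int(y + h))   # bottom edge
    return xA, yA, xB, yB
```

With an image stored as rows by columns, the person crop would then be the rows yA to yB and the columns xA to xB.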

The class name was the label of the PID directory in the Training directory, as shown in Figure 4-2-15. The image was sharpened because the person in the video could be blurred, as the resolution of a CCTV camera is lower than that of a digital camera. First, the temporary training directory (tmpTraining) and the PID were joined to check whether the path existed.
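The path check can be sketched with the standard library. The report only states that the joined path is checked; creating the directory when it is missing is an assumption made here for illustration, and the function name is hypothetical.

```python
import os


def ensure_pid_directory(tmp_training_dir, pid):
    """Join tmpTraining and the PID, and report whether the path already existed."""
    pid_path = os.path.join(tmp_training_dir, str(pid))
    existed = os.path.exists(pid_path)
    if not existed:
        # Assumption: a missing PID directory is created so crops can be stored.
        os.makedirs(pid_path)
    return pid_path, existed
```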

The inputs of this application were the images of the targeted person and the camera videos. The targeted person is also shown in multiple videos as output for users.

Figure 4-2-3: Videos and Resized Videos

SYSTEM IMPLEMENTATION

  • Methodologies and General Work Procedures
  • Tools to Use
    • Hardware Setup
    • Software Setup
  • User Requirement
  • System Performance
  • Verification Plan
  • System Operation (with Screenshot)
  • SYSTEM EVALUATION AND DISCUSSION

The hardware requirements for the person identification application for video surveillance are detailed in Table 5-2-1. Therefore, it is the most suitable language to implement the person identification application for video surveillance. The inputs were the images of the targeted person and the surveillance videos, while the output was the person identification result, with the target marked by green bounding boxes.

In this person identification model, the performance of the CNN model was evaluated based on percentage accuracy. The application had three main processes: person detection, person tracking and person identification. Person identification is performed by a CNN model using the training dataset and the query dataset.
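Percentage accuracy is the share of correct predictions over all evaluated samples. A minimal sketch follows, with a hypothetical function name and made-up labels used purely for illustration.

```python
def accuracy_percentage(predicted, actual):
    """Percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no samples to evaluate")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

For instance, three correct predictions out of four samples give 75.0.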

If the user wants to watch the 4 simultaneous videos again, the user clicks 'Show person identification videos', as in Figure 5-6-11. The user watches the downloaded person identification videos one by one on the video player, as shown in Figure 5-6-13.

Table 5-2-1: Hardware Requirements

SYSTEM EVALUATION AND DISCUSSION

  • System Testing
  • Figure 6-1-5: Frames in the Video Identification 1
  • 6.2 Use Case Testing
    • Objectives Evaluation
    • Project Challenges
  • CONCLUSION AND RECOMMENDATION

In the upper left and upper right parts of Figure 6-1-5, a person appeared alone in a corner. In the lower left and lower right parts, however, a person appeared close to others. In the upper right, person occlusion would almost occur if the persons were closer to each other.

When the person ID and target ID were the same, the person in the video was marked with a green bounding box. When a person was identified as the target, the ID assigned by the tracker was recorded, and whenever the tracker detected that ID, a green bounding box was drawn around the person with that ID in the video. After identifying the person in the videos, the person identification videos were displayed simultaneously as output.
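The matching rule can be sketched as a simple filter over the tracker output. The data layout (a list of track ID and box pairs) and the names below are assumptions for illustration; only the rule itself, a green box when the IDs match, comes from the text.

```python
GREEN = (0, 255, 0)  # green is (0, 255, 0) in both RGB and BGR order


def target_boxes(tracks, target_id):
    """Return (box, colour) pairs for tracks whose ID equals the target ID.

    tracks is assumed to be a list of (track_id, (xA, yA, xB, yB)) tuples.
    Tracks with other IDs are left without a box.
    """
    return [(box, GREEN) for track_id, box in tracks if track_id == target_id]
```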

So when a person was predicted as None, it would be stored in the PID directory of the tmpTraining directory. Therefore, the detector could miss the person, as shown in the red circles in Figure 6-4-3.

Figure 6-1-2: Persons Cropped

CONCLUSION AND RECOMMENDATION

  • Project Review and Discussion
  • Novelties and Contributions
  • Conclusion with Supportive Remarks
  • Future Work

The person identification application tracks the target in multiple videos in a public or private environment. In short, a person may be tracked as a different person even though both are the same person, because the same person in multiple videos is assigned a different Person ID in each person identification video.

In this project, the application solved this problem through person identification, using the CNN model and the list prediction method. The person identification application for video surveillance reduces the manpower needed for video surveillance, reduces human error and improves the accuracy of person identification in video. The green bounding box indicates that the person in the bounding box is the target.

First, the person identification process took a long time to complete, as the DeepSORT tracker may need a better GPU to run faster. Given enough time, background removal could increase the accuracy of the person identification application.

Figure 7-3-1: Person identification in Video Surveillance Application

Le, “Robust person re-identification through combination of metric learning and late fusion techniques”.
Wang, “Improving semantic part features for supervised non-local similarity person re-identification,” Tsinghua Sci.
Sheng et al., “Globally and Efficiently Robust Pattern Mining for Person Re-identification,” IEEE Internet Things J., vol.
Peng, “Toward fast and kernelized orthogonal discriminant analysis of person re-identification,” Pattern Recognit., vol.
Lin, “Deep group shuffling dual random walks with label smoothing for person re-identification,” IEEE Access, vol.
Paulus, “Simple online and realtime tracking with a deep association metric,” in 2017 IEEE International Conference on Image Processing (ICIP), 2017.
[30] “Optimization with ADAM and RMSprop in a convolutional neural network (CNN): A case study for Telugu handwritten characters,” Int.
Zhu, “Class learning based on CNN feature extraction with optimized softmax and one-class classifiers,” IEEE Access, vol.

APPENDIX

FINAL YEAR PROJECT WEEKLY REPORT

  • WORK DONE
  • WORK TO BE DONE
  • PROBLEMS ENCOUNTERED
  • SELF EVALUATION OF THE PROGRESS
  • SELF EVALUATION OF THE PROGRESS: Need to increase the speed of implementation
  • PROBLEMS ENCOUNTERED: The DeepSORT tracker is very slow
  • PROBLEMS ENCOUNTERED
  • PROBLEMS ENCOUNTERED: Have not started writing the report
  • WORK DONE: Draft FYP2 report
  • SELF EVALUATION OF THE PROGRESS: Be more effective in writing the report
  • WORK DONE: FYP2 Report in progress
  • WORK DONE
  • WORK TO BE DONE: Final check on FYP2 Report
  • PROBLEMS ENCOUNTERED: -
  • SELF EVALUATION OF THE PROGRESS: Submit the report on time

Do not retrain the model when new videos are added; use the transfer learning method instead. Connect Visual Studio Code (Frontend) and Jupyter Notebook (Backend). Try to figure out how to remove the background if possible. Need to proceed faster and solve more complex internal problems such as the bounding box and UI design.

The video UI is too complex to complete, as it exceeds 50 MB.

PLAGIARISM CHECK RESULT

CHECKLIST

UNIVERSITI TUNKU ABDUL RAHMAN FACULTY OF INFORMATION & COMMUNICATION TECHNOLOGY (KAMPAR CAMPUS)

Figures

Figure 1-1-1: Tracking multiple persons under multiple cameras by application. [1]
Figure 2-3-1: Difference Between High Inter-Person Distance and Low Inter-Person Distance
