CHAPTER 6
System Evaluation and Discussion
Bachelor of Computer Science (Honours)
Faculty of Information and Communication Technology (Kampar Campus), UTAR
6.1 Automatic License Plate Recognition
6.1.1 Evaluation Criteria
6.1.2 Comparison between Candidate Models
Table 6.1.2.1 Comparison of training and validation loss between candidate models

Candidate Model | Training Loss | Validation Loss | |Validation Loss - Training Loss|
model_42        | 0.000398      | 0.003214        | 0.002816
model_46        | 0.000331      | 0.003062        | 0.002731
model_49        | 0.000325      | 0.002894        | 0.002569
We can observe that the difference between training and validation loss was relatively small for every model, with model_49 showing the smallest gap. However, this metric alone was not sufficient for selecting the best model; we still needed to judge their recognition performance on unseen datasets.
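The selection criterion described above (the absolute gap between validation and training loss) can be sketched as follows, using the loss values from Table 6.1.2.1; the code itself is illustrative, not taken from the actual training pipeline:

```python
# Training/validation losses of the candidate models (from Table 6.1.2.1).
candidates = {
    "model_42": {"train": 0.000398, "val": 0.003214},
    "model_46": {"train": 0.000331, "val": 0.003062},
    "model_49": {"train": 0.000325, "val": 0.002894},
}

def loss_gap(name):
    """Absolute difference between validation and training loss."""
    c = candidates[name]
    return abs(c["val"] - c["train"])

# Candidate with the smallest generalization gap.
best = min(candidates, key=loss_gap)
print(best, round(loss_gap(best), 6))  # model_49 0.002569
```

A small gap alone only indicates that the model is not overfitting; it says nothing about absolute accuracy, which is why the benchmark evaluation below was still needed.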
To evaluate the performance of the candidate models, three separate datasets were used for benchmarking purposes, namely LPR44, LPR45, and the Open Environment Dataset (OED). The level of difficulty increased from LPR44 to OED, with LPR44 having the best image quality and the least noise, and OED having the worst image quality and the most noise. The recognition accuracy on the different license plate datasets is summarized in Table 6.1.2.2.
Table 6.1.2.2 Comparison of prediction accuracy between candidate models across datasets

Model    | LPR44 (409 samples) [Low difficulty] | LPR45 (553 samples) [Medium difficulty] | Open Environment Dataset (2533 samples) [High difficulty]
model_42 | 94.18%                               | 90.78%                                  | 75.34%
model_46 | 94.18%                               | 90.93%                                  | 75.11%
model_49 | 94.18%                               | 91.04%                                  | 75.73%
All three models performed equally well on the easiest dataset, LPR44, with up to 94% accuracy, while the accuracy dropped slightly on LPR45, which contains blurrier images. Note that although the image quality of OED was very poor compared to the training dataset, all the models still managed to achieve an accuracy level of about 75%. This demonstrated the robustness of the models in handling extreme noise in the images. Since our objective was to train a robust license plate recognition model that generalizes well, model_49 was selected as our final model due to its slightly better performance on the more difficult datasets, LPR45 and OED.
6.1.3 Character-Level Accuracy
Although the prediction accuracy of 95.68% achieved by our ALPR on the testing dataset was considered high, it was also important to study the performance of our model in character-level recognition, because poor character recognition eventually leads to low plate-level prediction accuracy. Hence, we constructed a confusion matrix to outline the misclassification rate of character-level recognition on the testing dataset.
Figure 6.1.3.1 Confusion matrix of ALPR on testing dataset for character-level recognition

We discovered that certain pairs of characters were more likely to be mislabeled than others, mainly "M"-"W" (mislabeled 17 times), "V"-"Y" (mislabeled 13 times), "6"-"8" (mislabeled 25 times), and "8"-"9" (mislabeled 23 times).
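The most-confused pairs can be read off the confusion matrix by summing each symmetric pair of off-diagonal cells (true i predicted as j, plus true j predicted as i). A small sketch with a toy 3-class matrix; the class set and the counts are illustrative, not the actual figures from Figure 6.1.3.1:

```python
import numpy as np

# Toy confusion matrix over the classes ["6", "8", "9"];
# rows = true class, columns = predicted class.
classes = ["6", "8", "9"]
cm = np.array([
    [100,  15,   2],   # true "6"
    [ 10, 120,  14],   # true "8"
    [  1,   9, 110],   # true "9"
])

# Total confusions for each unordered pair (i, j), i < j.
pairs = {}
for i in range(len(classes)):
    for j in range(i + 1, len(classes)):
        pairs[(classes[i], classes[j])] = int(cm[i, j] + cm[j, i])

worst = max(pairs, key=pairs.get)
print(worst, pairs[worst])  # ('6', '8') 25
```

Summing both directions matters because confusion is often asymmetric (e.g. a degraded "8" may read as "6" more often than the reverse).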
Note that the mislabeled characters are visually similar to each other: "M" and "W" are vertical mirror images of one another, "V" and "Y" both consist of two slashes joined together, and "6", "8", and "9" are all built from circular strokes. Therefore, when a character on the license plate image was blurry or incomplete, its contour might not be very clear, making it easy for the model to label the character incorrectly. Nevertheless, the overall character-level accuracy of our ALPR, at 98.54%, remained satisfactory.
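Character-level accuracy can be approximated by comparing the predicted and true plates position by position. The sketch below assumes each prediction has the same length as its label; a full evaluation would use an edit-distance alignment to handle inserted or dropped characters (the function and data are illustrative):

```python
def char_accuracy(predictions, labels):
    """Fraction of characters predicted correctly, compared position by
    position. Assumes each prediction has the same length as its label."""
    correct = total = 0
    for pred, label in zip(predictions, labels):
        correct += sum(p == y for p, y in zip(pred, label))
        total += len(label)
    return correct / total

# Toy example: one wrong character ("M" vs "W") out of 13 -> ~92.3%.
preds = ["PLG2052", "MA3704"]
truth = ["PLG2052", "WA3704"]
print(round(char_accuracy(preds, truth), 4))  # 0.9231
```

This also shows why character-level accuracy (98.54%) exceeds plate-level accuracy (95.68%): a plate with one wrong character out of seven counts as fully wrong at plate level but only about 14% wrong at character level.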
6.1.4 Comparison of Proposed Model with Existing Model
Prediction Results

Table 6.1.4.1 Comparison of prediction results between proposed and existing model

No. | Input Image       | Actual  | Proposed Model | Existing Model
1   | (image not shown) | PLG2052 | PLG2052        | PLG2052
2   | (image not shown) | PJP6297 | PJP6297        | JP6297
3   | (image not shown) | AHW7228 | AHW7228        | HW7228
5   | (image not shown) | WA3704  | MA3704         | HA3704
6   | (image not shown) | WPA4316 | WPA4316        | WPH4316
From the prediction results above, it can be observed that both the proposed and existing models were able to correctly recognize the characters on a very clear license plate image (PLG2052). However, when it came to license plates with incomplete characters (PJP6297, AHW7228, WPA4316), the existing model failed to read them precisely; the incomplete characters may have been treated as unwanted noise not belonging to the license plate, e.g. a car logo or a plate rivet. In contrast, the proposed model could still identify all the characters accurately, thanks to the locality and generalization gained earlier through training on a diversified and noisy dataset. However, for the plate with no clear character edges (WA3704), which was unreadable even by a human, it was reasonable and acceptable that both models failed to identify the characters.
Summary of Comparison Results

Table 6.1.4.2 Summary of comparison between proposed model and existing model

Metric                   | Dataset          | Proposed Model | Existing Model
Prediction Accuracy      | Testing Set      | 95.68%         | 92.44%
                         | LPR44            | 94.18%         | 91.55%
                         | LPR45            | 91.04%         | 87.67%
                         | Open Environment | 75.73%         | 70.98%
Character-Level Accuracy | Testing Set      | 98.54%         | 97.05%
                         | LPR44            | 98.16%         | 95.92%
                         | LPR45            | 96.62%         | 95.01%
                         | Open Environment | 83.48%         | 80.72%
CNN Parameter Size       |                  | ~5.5 million   | ~2 billion
Average Prediction Time  |                  | < 1 second     | 5-6 seconds
Judging from the comparison results above, our proposed CRNN model clearly outperformed the existing model in both prediction accuracy and character-level accuracy on the testing and benchmark datasets. Furthermore, since the CNN layers had a much smaller parameter size, our average prediction time improved significantly, to less than 1 second.
6.2 Catch - Overspeeding Detection