CHAPTER 6
System Evaluation and Discussion
Bachelor of Computer Science (Honours)
Faculty of Information and Communication Technology (Kampar Campus), UTAR
6.1 Automatic License Plate Recognition
6.1.1 Evaluation Criteria
6.1.2 Comparison between Candidate Models
Table 6.1.2.1 Comparison of training and validation loss between candidate models

Candidate Model | Training Loss | Validation Loss | |Validation Loss - Training Loss|
model_42        | 0.000398      | 0.003214        | 0.002816
model_46        | 0.000331      | 0.003062        | 0.002731
model_49        | 0.000325      | 0.002894        | 0.002569
We can observe that the difference between training and validation loss was relatively small for every model, with model_49 showing the smallest gap. However, this metric alone was not sufficient for selecting the best model; we still needed to judge their recognition performance on unseen datasets.
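The selection criterion described above (the absolute gap between validation and training loss) can be sketched as follows, using the loss values from Table 6.1.2.1; the code itself is illustrative, not taken from the actual training pipeline:

```python
# Training/validation losses of the candidate models (from Table 6.1.2.1).
candidates = {
    "model_42": {"train": 0.000398, "val": 0.003214},
    "model_46": {"train": 0.000331, "val": 0.003062},
    "model_49": {"train": 0.000325, "val": 0.002894},
}

def loss_gap(name):
    """Absolute difference between validation and training loss."""
    c = candidates[name]
    return abs(c["val"] - c["train"])

# Candidate with the smallest generalization gap.
best = min(candidates, key=loss_gap)
print(best, round(loss_gap(best), 6))  # model_49 0.002569
```

A small gap alone only indicates that the model is not overfitting; it says nothing about absolute accuracy, which is why the benchmark evaluation below was still needed.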
To evaluate the performance of the candidate models, three separate datasets were used for benchmarking purposes, namely LPR44, LPR45, and the Open Environment Dataset (OED). The level of difficulty increased from LPR44 to OED, with LPR44 having the best image quality and the least noise, and OED having the worst image quality and the most noise. The recognition accuracy on the different license plate datasets is summarized in Table 6.1.2.2.
Table 6.1.2.2 Comparison of prediction accuracy between candidate models across datasets

Model    | LPR44 (409 samples) [Low difficulty] | LPR45 (553 samples) [Medium difficulty] | Open Environment Dataset (2533 samples) [High difficulty]
model_42 | 94.18%                               | 90.78%                                  | 75.34%
model_46 | 94.18%                               | 90.93%                                  | 75.11%
model_49 | 94.18%                               | 91.04%                                  | 75.73%
All three models performed equally well on the easiest dataset, LPR44, with up to 94% accuracy, while the accuracy dropped slightly on LPR45, which contains blurrier images. Note that although the image quality of OED was very poor compared to the training dataset, all the models still managed to achieve an accuracy level of about 75%. This demonstrated the robustness of the models in handling extreme noise in the images. Since our objective was to train a robust license plate recognition model that generalizes well, model_49 was selected as our final model due to its slightly better performance on the more difficult datasets, LPR45 and OED.
6.1.3 Character-Level Accuracy
Although the prediction accuracy of 95.68% achieved by our ALPR on the testing dataset was considered high, it was also important to study the performance of our model in character-level recognition, because poor character recognition eventually leads to low plate-level prediction accuracy. Hence, we constructed a confusion matrix to outline the misclassification rate of character-level recognition on the testing dataset.
Figure 6.1.3.1 Confusion matrix of ALPR on testing dataset for character-level recognition

We discovered that certain pairs of characters were more likely to be mislabeled than others, mainly "M"-"W" (mislabeled 17 times), "V"-"Y" (mislabeled 13 times), "6"-"8" (mislabeled 25 times), and "8"-"9" (mislabeled 23 times).
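The most-confused pairs can be read off the confusion matrix by summing each symmetric pair of off-diagonal cells (true i predicted as j, plus true j predicted as i). A small sketch with a toy 3-class matrix; the class set and the counts are illustrative, not the actual figures from Figure 6.1.3.1:

```python
import numpy as np

# Toy confusion matrix over the classes ["6", "8", "9"];
# rows = true class, columns = predicted class.
classes = ["6", "8", "9"]
cm = np.array([
    [100,  15,   2],   # true "6"
    [ 10, 120,  14],   # true "8"
    [  1,   9, 110],   # true "9"
])

# Total confusions for each unordered pair (i, j), i < j.
pairs = {}
for i in range(len(classes)):
    for j in range(i + 1, len(classes)):
        pairs[(classes[i], classes[j])] = int(cm[i, j] + cm[j, i])

worst = max(pairs, key=pairs.get)
print(worst, pairs[worst])  # ('6', '8') 25
```

Summing both directions matters because confusion is often asymmetric (e.g. a degraded "8" may read as "6" more often than the reverse).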
Note that the mislabeled characters are visually similar to each other: "M" and "W" are vertical mirror images of one another, "V" and "Y" both consist of two slashes joined together, and "6", "8", and "9" are all built from circular strokes. Therefore, when a character on the license plate image was blurry or incomplete, its contour might not be very clear, making it easy for the model to label the character incorrectly. Nevertheless, the overall character-level accuracy of our ALPR, at 98.54%, remained satisfactory.
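Character-level accuracy can be approximated by comparing the predicted and true plates position by position. The sketch below assumes each prediction has the same length as its label; a full evaluation would use an edit-distance alignment to handle inserted or dropped characters (the function and data are illustrative):

```python
def char_accuracy(predictions, labels):
    """Fraction of characters predicted correctly, compared position by
    position. Assumes each prediction has the same length as its label."""
    correct = total = 0
    for pred, label in zip(predictions, labels):
        correct += sum(p == y for p, y in zip(pred, label))
        total += len(label)
    return correct / total

# Toy example: one wrong character ("M" vs "W") out of 13 -> ~92.3%.
preds = ["PLG2052", "MA3704"]
truth = ["PLG2052", "WA3704"]
print(round(char_accuracy(preds, truth), 4))  # 0.9231
```

This also shows why character-level accuracy (98.54%) exceeds plate-level accuracy (95.68%): a plate with one wrong character out of seven counts as fully wrong at plate level but only about 14% wrong at character level.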
6.1.4 Comparison of Proposed Model with Existing Model
Prediction Results

Table 6.1.4.1 Comparison of prediction results between proposed and existing model

No. | Input Image       | Actual  | Proposed Model | Existing Model
1   | (image not shown) | PLG2052 | PLG2052        | PLG2052
2   | (image not shown) | PJP6297 | PJP6297        | JP6297
3   | (image not shown) | AHW7228 | AHW7228        | HW7228
5   | (image not shown) | WA3704  | MA3704         | HA3704
6   | (image not shown) | WPA4316 | WPA4316        | WPH4316
From the prediction results above, it can be observed that both the proposed and existing models were able to correctly recognize the characters on a very clear license plate image (PLG2052). However, when it came to license plates with incomplete characters (PJP6297, AHW7228, WPA4316), the existing model failed to read them precisely; the incomplete characters may have been treated as unwanted noise not belonging to the license plate, e.g. a car logo or a plate rivet. In contrast, the proposed model could still identify all the characters accurately, thanks to the locality and generalization gained earlier through training on a diversified and noisy dataset. However, for the plate with no clear character edges (WA3704), which was unreadable even by a human, it was reasonable and acceptable that both models failed to identify the characters.
Summary of Comparison Results

Table 6.1.4.2 Summary of comparison between proposed model and existing model

Metric                   | Dataset          | Proposed Model | Existing Model
Prediction Accuracy      | Testing Set      | 95.68%         | 92.44%
                         | LPR44            | 94.18%         | 91.55%
                         | LPR45            | 91.04%         | 87.67%
                         | Open Environment | 75.73%         | 70.98%
Character-Level Accuracy | Testing Set      | 98.54%         | 97.05%
                         | LPR44            | 98.16%         | 95.92%
                         | LPR45            | 96.62%         | 95.01%
                         | Open Environment | 83.48%         | 80.72%
CNN Parameter Size       |                  | ~5.5 million   | ~2 billion
Average Prediction Time  |                  | < 1 second     | 5-6 seconds
Judging from the comparison results above, our proposed CRNN model clearly outperformed the existing model in both prediction accuracy and character-level accuracy on the testing and benchmark datasets. Furthermore, since the CNN layers had a much smaller parameter size, our average prediction time improved significantly, to less than 1 second.
6.2 Catch - Overspeeding Detection