Do We Train on Test Data?
The Impact of Near-Duplicates on License Plate Recognition


Rayson Laroca
Valter Estevam
Alceu S. Britto Jr.
Rodrigo Minetto
David Menotti

Federal University of Paraná, Curitiba, Brazil
Federal Institute of Paraná, Irati, Brazil
Pontifical Catholic University of Paraná, Curitiba, Brazil
Federal University of Technology-Paraná, Curitiba, Brazil

IJCNN 2023


Examples of near-duplicates in the training and test sets of the AOLP and CCPD datasets, which are by far the two most popular datasets in the License Plate Recognition (LPR) literature. The top row shows license plates cropped and rectified from images in the training sets, while the bottom row shows license plates cropped and rectified from their nearest neighbors in the respective test set. For each dataset, we show three image pairs selected at the 10th, 50th, and 90th percentiles of Euclidean distance in pixel space.
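As a rough illustration of how such pairs can be ranked (a minimal sketch, not the authors' exact pipeline), the snippet below finds, for each training image, its nearest test-set image by Euclidean distance in flattened pixel space; the function and variable names are our own.

```python
import numpy as np

def nearest_test_neighbors(train_imgs, test_imgs):
    """For each training image, return the index of and distance to its
    nearest test-set image, measured by Euclidean distance in pixel space.

    train_imgs, test_imgs: arrays of shape (N, H, W[, C]) with matching
    image dimensions.
    """
    train = train_imgs.reshape(len(train_imgs), -1).astype(np.float64)
    test = test_imgs.reshape(len(test_imgs), -1).astype(np.float64)
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = (
        (train ** 2).sum(axis=1)[:, None]
        + (test ** 2).sum(axis=1)[None, :]
        - 2.0 * train @ test.T
    )
    nn_idx = d2.argmin(axis=1)  # nearest test image for each training image
    nn_dist = np.sqrt(np.maximum(d2[np.arange(len(train)), nn_idx], 0.0))
    return nn_idx, nn_dist
```

Sorting the resulting distances and picking the pairs at the 10th, 50th, and 90th percentiles (e.g., with `np.percentile`) yields a spread of candidate duplicates like the one shown above.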



Abstract

This work draws attention to the large fraction of near-duplicates in the training and test sets of datasets widely adopted in License Plate Recognition (LPR) research. These duplicates refer to images that, although different, show the same license plate. Our experiments, conducted on the two most popular datasets in the field, show a substantial decrease in recognition rate when six well-known models are trained and tested under fair splits, that is, in the absence of duplicates in the training and test sets. Moreover, in one of the datasets, the ranking of models changed considerably when they were trained and tested under duplicate-free splits. These findings suggest that such duplicates have significantly biased the evaluation and development of deep learning-based models for LPR. The list of near-duplicates we have found and proposals for fair splits are publicly available for further research.



Paper

Rayson Laroca, Valter Estevam, Alceu S. Britto Jr., Rodrigo Minetto, David Menotti
Do we train on test data? The impact of near-duplicates on license plate recognition
International Joint Conference on Neural Networks (IJCNN), pp. 1-8, June 2023.
In summary, this paper has two main contributions:
  1. We unveil the presence of near-duplicates in the training and test sets of datasets widely adopted in the ALPR literature. Our analysis, using the AOLP and CCPD datasets, shows the impact of such duplicates on the performance assessment of six well-known OCR models applied to LPR.
  2. We create and release fair splits for the AOLP and CCPD datasets where there are no duplicates in the training and test sets, and the key characteristics of the original splits are preserved as much as possible.
[IEEE Xplore] [arXiv] [BibTeX] [List of Duplicates] [Original & Fair Splits]


Related Work

This work is inspired by [1], which identified duplicates in the CIFAR-10 and CIFAR-100 datasets, and motivated by recent studies indicating the existence of bias in the ALPR context. For example, in [2], the authors observed significant drops in LPR performance when training and testing state-of-the-art OCR models in a leave-one-dataset-out experimental setup. As another example, the "Name That Dataset!" experiments carried out in [3] suggested that each LPR dataset has a unique and identifiable "signature," as a lightweight classification model could predict the source dataset of a license plate image at levels significantly better than chance.


[1] - B. Barz and J. Denzler, “Do we train on test data? Purging CIFAR of near-duplicates,” Journal of Imaging, vol. 6, no. 6, p. 41, 2020. [MDPI] [arXiv]

[2] - R. Laroca, E. V. Cardoso, D. R. Lucio, V. Estevam, and D. Menotti, “On the cross-dataset generalization in license plate recognition,” in International Conference on Computer Vision Theory and Applications (VISAPP), pp. 166-178, Feb 2022. [SciTePress] [arXiv]

[3] - R. Laroca, M. Santos, V. Estevam, E. Luz, and D. Menotti, “A first look at dataset bias in license plate recognition,” in Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 234-239, Oct 2022. [IEEE Xplore] [arXiv]



Acknowledgments

This work was supported in part by the Coordination for the Improvement of Higher Education Personnel (CAPES), and in part by the National Council for Scientific and Technological Development (CNPq). The Quadro RTX 8000 GPU used for this research was donated by NVIDIA.