Analysis and Mitigation of the Imbalance Impact on an Industrial Image Classification Dataset

Title: Analysis and Mitigation of the Imbalance Impact on an Industrial Image Classification Dataset

Authors: Willian Jeferson Andrade, Andre Eugenio Lazzaretti, Ricardo Eiji Kondo, Heitor Silverio Lopes, Valmir de Oliveira

Abstract: One of the most significant problems in building machine learning and deep learning models for automated vision systems in industrial environments is the large variety of products on a real production line, which means that the dataset generated by these systems will most probably be imbalanced. Several approaches to reducing the effects of imbalanced data have been studied in the recent literature, but none has yet been recommended as the most adequate methodology for industrial scenarios. Hence, this paper compares three approaches for reducing the effects of imbalanced classes, using a dataset of real images collected on an industrial production line: class removal, weight compensation, and data augmentation. We use a convolutional neural network as the backbone for the classifier of the proposed method. Several comparisons are presented, emphasizing the advantages and limitations of each approach. Results show that data augmentation is the most promising approach for the evaluated dataset, improving the results and allowing the real-world application of the proposed method.
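As an illustration of what weight compensation can look like in practice, the sketch below computes inverse-frequency class weights, a common heuristic for imbalanced classification. The abstract does not state the weighting scheme used in the paper, so the formula, the function name `inverse_frequency_weights`, and the toy labels are assumptions made only for this example.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count).

    Assumed heuristic for illustration; not necessarily the paper's scheme.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels: 8 images of a frequent class, 2 of a rare class.
labels = ["frequent"] * 8 + ["rare"] * 2
weights = inverse_frequency_weights(labels)
```

Under this heuristic the rare class receives a larger weight than the frequent one, so its misclassifications contribute more to a weighted training loss.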

Keywords: Imbalanced dataset, Data augmentation, Weight compensation, Classification, Convolutional neural networks

Pages: 6

DOI: 10.21528/CBIC2023-038

Paper (PDF): CBIC_2023_paper038.pdf

BibTeX file: CBIC_2023_038.bib