Using the Kullback-Leibler Divergence and Kolmogorov-Smirnov Test to Select Input Sizes to the Fault Diagnosis Problem Based on a CNN Model

Rodrigo P. Monteiro, Carmelo J. A. Bastos-Filho, Mariela Cerrada, Diego R. Cabrera & René V. Sánchez

Abstract: Choosing a suitable size for signal representations, e.g., frequency spectra, in a given machine learning problem is not a trivial task, and the choice may strongly affect the performance of the trained models. Many solutions have been proposed for this problem. Most of them rely on designing an optimized input or on selecting the most suitable input through an exhaustive search. In this work, we used the Kullback-Leibler Divergence and the Kolmogorov-Smirnov Test to measure the dissimilarity among signal representations belonging to the same class and to different classes, i.e., we measured the intraclass and interclass dissimilarities. Moreover, we analyzed how this information relates to the classifier performance. The results suggested that both the interclass and intraclass dissimilarities were related to the model accuracy, since they indicate how easily a model can learn discriminative information from the input data. The highest ratios between the average interclass and intraclass dissimilarities were associated with the most accurate classifiers. This information can therefore be used to select a suitable input size for training the classification model. The approach was tested on two data sets related to the fault diagnosis of reciprocating compressors.
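The ratio between average interclass and intraclass dissimilarities described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy spectra, the helper names (`kl_divergence`, `ks_statistic`, `inter_intra_ratio`, `make_toy_spectra`), the symmetrized form of the Kullback-Leibler Divergence, and the choice to treat bin amplitudes as samples for the Kolmogorov-Smirnov statistic are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Symmetrized KL divergence between two non-negative spectra.

    Assumption: each spectrum is normalized into a discrete distribution over
    its bins, and the two KL directions are averaged.
    """
    p = np.asarray(p, dtype=float) + eps  # eps avoids log(0) and division by zero
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return 0.5 * float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs.

    Assumption: the bin amplitudes of each spectrum are treated as samples.
    """
    a, b = np.sort(np.asarray(a, dtype=float)), np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def inter_intra_ratio(spectra, labels, measure):
    """Average interclass dissimilarity divided by average intraclass dissimilarity."""
    intra, inter = [], []
    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            d = measure(spectra[i], spectra[j])
            (intra if labels[i] == labels[j] else inter).append(d)
    return float(np.mean(inter) / np.mean(intra))


def make_toy_spectra(n_bins, n_per_class, rng):
    """Hypothetical two-class data: one narrow and one wide spectral peak plus noise."""
    bins = np.arange(n_bins)
    peaks = [(0.15 * n_bins, 0.03 * n_bins), (0.60 * n_bins, 0.12 * n_bins)]
    spectra, labels = [], []
    for label, (center, width) in enumerate(peaks):
        for _ in range(n_per_class):
            peak = np.exp(-0.5 * ((bins - center) / width) ** 2)
            noise = np.abs(rng.normal(0.0, 0.01, n_bins))
            spectra.append(peak + noise)
            labels.append(label)
    return np.array(spectra), np.array(labels)


rng = np.random.default_rng(0)
spectra, labels = make_toy_spectra(n_bins=64, n_per_class=10, rng=rng)
kl_ratio = inter_intra_ratio(spectra, labels, kl_divergence)
ks_ratio = inter_intra_ratio(spectra, labels, ks_statistic)
print(f"KL inter/intra ratio: {kl_ratio:.2f}")
print(f"KS inter/intra ratio: {ks_ratio:.2f}")
```

Under these assumptions, the same ratio would be computed once per candidate input size, and the size with the highest ratio would be the one to try first. SciPy's `scipy.stats.entropy` and `scipy.stats.ks_2samp` provide library versions of the two measures.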

Keywords: Deep learning, Kullback-Leibler Divergence, Kolmogorov-Smirnov Test, Input Size Selection.

DOI code: 10.21528/lnlm-vol18-no2-art2

PDF file: vol18-no2-art2.pdf

BibTeX file: vol18-no2-art2.bib