Title: Cardinality and Density Measures and Their Influence to Multi-Label Learning Methods
Authors: Bernardini, Flavia Cristina; Silva, Rodrigo Barbosa da; Rodovalho, Rodrigo Magalhães; Meza, Edwin Benito Mitacc
Abstract: Two main characteristics of a multi-label dataset are cardinality and density, both related to the number of labels of the dataset and of each of its instances. The relation between these characteristics and multi-label learning performance has been observed across different datasets. However, differences in the domain attributes of the datasets also interfere with multi-label learning performance. In this work, we use a real dataset, the Million Song Dataset (MSD), which is available on the internet. A particularly useful characteristic of this dataset is that many labels are associated with its instances (songs). We conduct experiments on datasets derived from MSD, and the results show that both density and cardinality influence the performance of the multi-label learning methods used in this work. To extend our analysis, we also analyze the results obtained on natural datasets, i.e., datasets available on the internet that were pre-processed for empirical tests in multi-label learning. Our results show that density influences multi-label learning more than cardinality does.
Keywords: Multi-label Learning; Cardinality; Density Measures
Pages: 19
DOI: 10.21528/lmln-vol12-no1-art4
Article PDF: vol12-no1-art4.pdf
BibTeX file: vol12-no1-art4.bib
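
Illustrative note: the abstract names cardinality and density without defining them. The sketch below assumes the definitions commonly used in the multi-label literature (cardinality as the mean number of labels per instance, density as cardinality divided by the total number of distinct labels); this record does not confirm that the article uses exactly these forms. The function names and the toy song tags are hypothetical and are not taken from the Million Song Dataset.

```python
from typing import Sequence, Set


def label_cardinality(label_sets: Sequence[Set[str]]) -> float:
    """Mean number of labels per instance (assumed standard definition)."""
    if not label_sets:
        raise ValueError("label_sets must contain at least one instance")
    return sum(len(labels) for labels in label_sets) / len(label_sets)


def label_density(label_sets: Sequence[Set[str]], num_labels: int) -> float:
    """Cardinality normalised by the number of distinct labels in the dataset."""
    if num_labels <= 0:
        raise ValueError("num_labels must be positive")
    return label_cardinality(label_sets) / num_labels


# Hypothetical toy data: three songs, each tagged with a set of labels.
toy_songs = [
    {"rock", "indie"},
    {"jazz"},
    {"rock", "pop", "dance"},
]
# Distinct labels across the toy data: rock, indie, jazz, pop, dance.
toy_num_labels = len(set().union(*toy_songs))

toy_cardinality = label_cardinality(toy_songs)          # (2 + 1 + 3) / 3
toy_density = label_density(toy_songs, toy_num_labels)  # cardinality / 5
print(f"cardinality={toy_cardinality:.2f}, density={toy_density:.2f}")
```

Under these assumed definitions, two datasets can share the same cardinality and still differ in density when their label vocabularies differ in size, which is why the two measures can be studied separately.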