Coutinho, F.P., Rezende, S.O., Rossi, R.G.
Abstract: Transductive Classification through Term Network (TCTN) is an accurate approach to semi-supervised text classification based on term networks. TCTN can surpass the accuracies obtained by transductive classification approaches that represent texts in other types of networks or in the vector space model. TCTN can also surpass the accuracies obtained by inductive supervised learning algorithms. In addition, the term networks in TCTN can be reduced in size while still maintaining classification performance, which implies a lower computational cost than other network-based semi-supervised learning approaches. Originally, TCTN considered only manually defined hyper-parameters. However, even better results can be achieved with more carefully chosen hyper-parameter values. Thus, in this article, we present a genetic algorithm (GA) that can be used to find better hyper-parameter values for TCTN. The proposed approach is called GA-TCTN. Our approach is applied to 25 text collections, and the results demonstrate that a GA can be useful together with TCTN for semi-supervised text classification. Besides this contribution, the hyper-parameter distributions are compared to identify patterns in their structure. The results indicate that TCTN and GA-TCTN tend to generate similar sets of hyper-parameters. However, GA-TCTN still allows the use of more specific hyper-parameter values, making it more flexible and practical than TCTN with manually defined parameters. Moreover, GA-TCTN obtained better results than TCTN, with statistically significant differences.
Keywords: Hyper-parameter Optimization, Genetic Algorithm, Transductive Classification, Semi-supervised Learning, Term Network.
DOI code: 10.21528/lnlm-vol17-no2-art3
PDF file: vol17-no2-art3.pdf
BibTeX file: vol17-no2-art3.bib