Viana, F.X., Araujo, G.M., Pinto, M.F., Colares, J., Haddad, D.B.
Abstract: In recent decades, trends in autonomous navigation have shown an increased use of computer vision over traditional techniques. This reflects the fact that most spaces are designed for human navigation and are therefore filled with visual cues. In this sense, visual recognition is an essential ability for avoiding obstacles when an autonomous vehicle interacts with the real world. Data collection with Unmanned Aerial Vehicles (UAVs) navigating real-world scenarios is a costly and time-consuming activity; for this reason, one of the most valuable assets of technology companies is a database of locations and interactions. One solution to this problem is the adoption of a photo-realistic 3D simulator as a data source, which makes it possible to gather a significant amount of data. Accordingly, this research creates an instance segmentation dataset from the frontal camera of a UAV navigating in a 3D simulator. This work applies a state-of-the-art deep learning technique, Mask R-CNN, an architecture that takes an image as input and predicts per-pixel instance segmentation. Experimental results showed that Mask R-CNN achieves superior performance on our dataset when fine-tuning a model pre-trained on the COCO dataset. Moreover, the proposed methodology shows good generalization capability, as indicated by promising results on real-world data.
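The per-pixel instance masks described in the abstract are typically compared against ground truth via mask Intersection-over-Union (IoU), the standard overlap measure used in COCO-style evaluation. A minimal sketch of that computation, using NumPy only, with hypothetical toy masks (not data from the paper):

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary instance masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Empty union means both masks are empty; define IoU as 0 in that case.
    return float(inter) / float(union) if union else 0.0

# Toy 4x4 example: predicted instance covers a 2x2 block,
# ground truth covers a 3x3 block in the same corner.
pred = np.zeros((4, 4), dtype=np.uint8)
gt = np.zeros((4, 4), dtype=np.uint8)
pred[0:2, 0:2] = 1
gt[0:3, 0:3] = 1
print(mask_iou(pred, gt))  # 4 / 9 ≈ 0.444
```

In COCO-style metrics, a predicted instance counts as a true positive when its mask IoU with a ground-truth instance exceeds a threshold (e.g. 0.5).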
Keywords: Computer Vision, Deep Learning, Instance Segmentation, 3D Simulator, Aerial Images.
DOI code: 10.21528/lnlm-vol18-no1-art3
PDF file: vol18-no1-art3.pdf
BibTex file: vol18-no1-art3.bib