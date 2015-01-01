Abstract

This paper proposes an improved DeepLabv3+ model for semantic segmentation of urban scenes, targeting autonomous driving applications. A high-quality semantic segmentation dataset is constructed from 2,967 manually labeled aerial images captured at a height of 200 m with a five-lens camera. The images are annotated with five classes: buildings, vegetation, ground, lakes, and playgrounds. The improved DeepLabv3+ network enriches high-level semantics by replacing max pooling with depthwise separable convolutions, and uses dilated (atrous) convolutions to extract multi-scale features, which is intended to reduce overfitting. Experiments show that the model achieves a mean IoU of 0.87 on the test set, with per-class IoU scores of 0.90, 0.92, and 0.94 for buildings, vegetation, and water (the lake class), respectively. These results suggest that the model can extract semantic information from complex urban environments to support navigation for autonomous vehicles.
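For readers unfamiliar with the reported metric, the following is a minimal sketch of how per-class IoU and mean IoU are commonly computed from a confusion matrix. It is not the evaluation code used in this work: the function names (`confusion_matrix`, `per_class_iou`, `mean_iou`) and the flattened integer label format are assumptions made for illustration only.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm: List[List[int]]) -> List[float]:
    """IoU_c = TP / (TP + FP + FN) for each class c."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        # A class absent from both labels and predictions has no defined IoU.
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious: Sequence[float]) -> float:
    """Average the per-class IoU values, skipping undefined (NaN) entries."""
    valid = [v for v in ious if v == v]  # NaN is the only value != itself
    return sum(valid) / len(valid) if valid else float("nan")
```

In practice the label sequences would be the flattened ground-truth and predicted masks over the whole test set, with one integer per pixel in the range of the five classes. Accumulating a single confusion matrix over all images, rather than averaging per-image scores, is one common convention; the abstract does not state which one was used.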
