SAM-Net: semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications.

Yang, Binchao; Xu, Xinying; Ren, Jinchang; Cheng, Lan; Guo, Lei; Zhang, Zhe

Authors

Binchao Yang

Xinying Xu

Lan Cheng

Lei Guo

Zhe Zhang



Abstract

3D scene understanding is an essential research topic in the field of Visual Odometry (VO). VO is usually built on the assumption of a static environment, which does not always hold in real scenarios. Existing works fail to account for dynamic objects, which leads to poor performance. To tackle this issue, we propose SAM-Net, a self-supervised learning-based VO framework with Semantic probabilistic and Attention Mechanisms, which jointly learns single-view depth, camera ego-motion and object detection. For depth estimation, a semantic probabilistic fusion mechanism detects dynamic objects and generates a semantic probability map, which is fed to the network as a prior to produce a more refined depth map, and an attention mechanism is explored to enhance perception along the spatial and channel dimensions. For pose estimation, we present a novel PoseNet that uses atrous separable convolution to expand the receptive field. In addition, a photometric consistency loss is employed to alleviate the impact of large rotations. Extensive experiments on the KITTI dataset demonstrate that the proposed approach achieves excellent performance in terms of pose and depth accuracy.
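
As a rough illustration of why atrous (dilated) separable convolution expands the receptive field, the following minimal sketch works in one dimension and in pure Python. It is not the paper's PoseNet: the function names, kernels and inputs are hypothetical and chosen only to show the idea. A kernel of size k with dilation d covers (k - 1) * d + 1 input samples, and the separable form applies a per-channel dilated filter followed by a pointwise (1x1) channel mix.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D atrous convolution (cross-correlation) of one channel."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


def atrous_separable_conv1d(channels, depthwise_kernels, pointwise_weights, dilation):
    """Depthwise dilated filtering followed by a pointwise (1x1) channel mix."""
    # Depthwise step: each channel is filtered independently by its own kernel.
    filtered = [
        dilated_conv1d(channel, kernel, dilation)
        for channel, kernel in zip(channels, depthwise_kernels)
    ]
    # Pointwise step: every output channel is a weighted sum of the filtered
    # channels at the same position.
    length = len(filtered[0])
    return [
        [sum(w * ch[i] for w, ch in zip(weights, filtered)) for i in range(length)]
        for weights in pointwise_weights
    ]


# Hypothetical toy input: two channels of five samples each.
channels = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]]
kernels = [[1, 0, -1], [1, 0, -1]]
mix = [[1, 1], [1, -1]]
output = atrous_separable_conv1d(channels, kernels, mix, dilation=2)
```

With the same three-tap kernel, raising the dilation from 1 to 2 widens the receptive field from 3 to 5 samples without adding any weights, which is the property the abstract refers to.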

Citation

YANG, B., XU, X., REN, J., CHENG, L., GUO, L. and ZHANG, Z. 2022. SAM-Net: semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications. Pattern recognition letters [online], 153, pages 126-135. Available from: https://doi.org/10.1016/j.patrec.2021.11.028

Journal Article Type Article
Acceptance Date Nov 30, 2021
Online Publication Date Dec 3, 2021
Publication Date Jan 31, 2022
Deposit Date Aug 9, 2022
Publicly Available Date Dec 4, 2022
Journal Pattern Recognition Letters
Print ISSN 0167-8655
Electronic ISSN 1872-7344
Publisher Elsevier
Peer Reviewed Peer Reviewed
Volume 153
Pages 126-135
DOI https://doi.org/10.1016/j.patrec.2021.11.028
Keywords Artificial intelligence; Computer vision and pattern recognition; Signal processing; Software; Visual odometry; Self-supervised deep learning; Object detection; Semantic probabilistic map; Attention mechanism
Public URL https://rgu-repository.worktribe.com/output/1545070
