SAM-Net: semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications.
Yang, Binchao; Xu, Xinying; Ren, Jinchang; Cheng, Lan; Guo, Lei; Zhang, Zhe
Abstract
3D scene understanding is an essential research topic in Visual Odometry (VO). VO is usually built under the assumption of a static environment, which does not always hold in real scenarios; existing works fail to account for dynamic objects, leading to poor performance. To tackle these issues, we propose SAM-Net, a self-supervised learning-based VO framework with semantic probabilistic and attention mechanisms, which jointly learns single-view depth, camera ego-motion and object detection. For depth estimation, a semantic probabilistic fusion mechanism detects dynamic objects and generates a semantic probability map as a prior, which is fed to the network to produce a more refined depth map, while an attention mechanism enhances perception in both the spatial and channel dimensions. For pose estimation, we present a novel PoseNet with atrous separable convolutions to expand the receptive field, and a photometric consistency loss is employed to alleviate the impact of large rotations. Extensive experiments on the KITTI dataset demonstrate that the proposed approach achieves excellent performance in terms of pose and depth accuracy.
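The abstract's idea of combining a photometric consistency loss with a semantic probability map can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the simple L1 photometric error, and the `1 - p` down-weighting scheme are all assumptions made for illustration; the paper's actual loss and fusion mechanism may differ.

```python
import numpy as np

def masked_photometric_loss(target, warped, dynamic_prob, eps=1e-6):
    """Hypothetical sketch of a semantically weighted photometric loss.

    target, warped : (H, W, 3) float images in [0, 1]; `warped` is the
        source frame synthesized into the target view via predicted
        depth and camera pose.
    dynamic_prob : (H, W) per-pixel probability that the pixel belongs
        to a dynamic object (e.g. from a semantic segmentation prior).

    Pixels likely to be dynamic are down-weighted, since their independent
    motion violates the static-scene assumption behind view synthesis.
    """
    weight = 1.0 - dynamic_prob                        # trust static pixels more
    per_pixel = np.abs(target - warped).mean(axis=-1)  # L1 photometric error
    return float((weight * per_pixel).sum() / (weight.sum() + eps))
```

For identical target and warped images the loss is zero; as warping errors grow on static regions the loss grows, while errors confined to high-probability dynamic regions contribute little, which is the intended effect of using the semantic map as a prior.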
Citation
YANG, B., XU, X., REN, J., CHENG, L., GUO, L. and ZHANG, Z. 2022. SAM-Net: semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications. Pattern recognition letters [online], 153, pages 126-135. Available from: https://doi.org/10.1016/j.patrec.2021.11.028
| Field | Value |
| --- | --- |
| Journal Article Type | Article |
| Acceptance Date | Nov 30, 2021 |
| Online Publication Date | Dec 3, 2021 |
| Publication Date | Jan 31, 2022 |
| Deposit Date | Aug 9, 2022 |
| Publicly Available Date | Dec 4, 2022 |
| Journal | Pattern Recognition Letters |
| Print ISSN | 0167-8655 |
| Electronic ISSN | 1872-7344 |
| Publisher | Elsevier |
| Peer Reviewed | Peer Reviewed |
| Volume | 153 |
| Pages | 126-135 |
| DOI | https://doi.org/10.1016/j.patrec.2021.11.028 |
| Keywords | Artificial intelligence; Computer vision and pattern recognition; Signal processing; Software; Visual odometry; Self-supervised deep learning; Object detection; Semantic probabilistic map; Attention mechanism |
| Public URL | https://rgu-repository.worktribe.com/output/1545070 |
Files
YANG 2022 SAM-Net (AAM) (PDF, 1.4 MB)
Publisher licence: https://creativecommons.org/licenses/by-nc-nd/4.0/