MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving
Chen WY(陈文玉)1,2,3,4,5; Li PX(李培玄)1,2,3,4,5; Zhao HC(赵怀慈)2,3,4,5
Journal: Neurocomputing
Year: 2022
Volume: 494; Pages: 23-32
Keywords: 3D object detection; Automatic driving; Multi-sensor fusion
ISSN: 0925-2312
Ownership ranking: 1
Abstract

In this paper, we propose MSL3D, a novel deep architecture that combines multiple sensors for 3D object detection. Although recent LiDAR-Camera methods introduce additional semantic cues and produce fewer false detections, there is still a performance gap compared with LiDAR-only methods. We argue that this gap has two causes: 1) the 3D spherical receptive fields of the set abstraction applied to point clouds are not aligned with the 2D pixel-level receptive fields of the image; 2) introducing image information too early makes it difficult to apply data augmentation to both LiDAR and image data synchronously. For the first problem, we extend 3D set abstraction to a 2D set abstraction that maps 2D image features onto the 3D sphere, unifying the receptive fields of the multi-modal data. For the second problem, we design a novel two-stage 3D detection framework that employs a LiDAR-only backbone in the first stage to estimate high-recall, high-quality proposals, and then integrates image and point cloud information for box refinement and confidence prediction. In addition, we add two auxiliary networks to learn image features and point cloud features effectively when different multi-modal data augmentation strategies are used synchronously. Moreover, we design a consistency-structure generator that uses stereo images to determine whether a given point in 3D space belongs to the contour of an object, thereby supplementing the sparse point cloud information. Extensive experiments on the popular KITTI 3D object detection dataset show that MSL3D achieves better performance than other LiDAR-only or LiDAR-Camera fusion approaches.
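
The idea of gathering image features over the same spherical neighborhood used for point clouds can be illustrated with a small sketch. This is not the authors' implementation: the function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the `feature_map[v][u]` layout, the nearest-pixel sampling, and the max pooling are all assumptions chosen to keep the example short.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def ball_query(points: Sequence[Point], center: Point, radius: float) -> List[Point]:
    """Return the points that lie inside a sphere around `center`."""
    r2 = radius * radius
    return [
        p for p in points
        if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2
    ]


def project_to_pixel(p: Point, fx: float, fy: float, cx: float, cy: float) -> Tuple[int, int]:
    """Pinhole projection of a camera-frame point (x right, y down, z forward)."""
    x, y, z = p
    u = fx * x / z + cx
    v = fy * y / z + cy
    return int(round(u)), int(round(v))


def gather_image_features(points, center, radius, feature_map, fx, fy, cx, cy):
    """Max-pool image features over the pixels hit by points inside the sphere.

    `feature_map[v][u]` is a list of channel values (a hypothetical layout).
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    pooled = None
    for p in ball_query(points, center, radius):
        if p[2] <= 0:  # behind the camera, cannot be projected
            continue
        u, v = project_to_pixel(p, fx, fy, cx, cy)
        if 0 <= u < width and 0 <= v < height:
            feat = feature_map[v][u]
            # Element-wise max across all sampled pixels (assumed pooling choice).
            pooled = feat[:] if pooled is None else [max(a, b) for a, b in zip(pooled, feat)]
    # Fall back to zeros when no point in the sphere lands inside the image.
    return pooled if pooled is not None else [0.0] * channels
```

Because the pixels are selected by the points inside the sphere rather than by a fixed 2D window, the image features and the point features describe the same 3D region, which is the alignment the abstract refers to.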

Language: English
Content type: Journal article
Source URL: http://ir.sia.cn/handle/173321/30990
Collection: Shenyang Institute of Automation_Optoelectronic Information Technology Research Laboratory
Corresponding author: Zhao HC(赵怀慈)
Author affiliations:
1. University of Chinese Academy of Sciences, Beijing, China
2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, China
3. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China
4. Key Lab of Image Understanding and Computer Vision, Liaoning Province, Shenyang, China
5. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China
Recommended citation
GB/T 7714: Chen WY, Li PX, Zhao HC. MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving[J]. Neurocomputing, 2022, 494: 23-32.
APA: Chen WY, Li PX, & Zhao HC. (2022). MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving. Neurocomputing, 494, 23-32.
MLA: Chen WY, et al. "MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving." Neurocomputing 494 (2022): 23-32.