Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection
Zhang, Zhaoxiang (1,2,4,5); Pan, Cong (1,4,5); Peng, Junran (3)
Journal: INTERNATIONAL JOURNAL OF COMPUTER VISION
Publication Date: 2022-04-01
Volume: 130; Issue: 4; Pages: 970-989
Keywords: Computer vision; Object detection; Effective receptive fields; Hardware acceleration
ISSN: 0920-5691
DOI: 10.1007/s11263-021-01573-6
Corresponding Author: Peng, Junran (pengjunran@huawei.com)
Abstract: Scale-sensitive object detection remains a challenging task: most existing methods do not learn scale explicitly and are not robust to scale variation. They are also inefficient during training or slow at inference, which makes them ill-suited to real-time applications. In this paper, we propose a scale-transferrable architecture for practical object detection based on an analysis of the connection between dilation rate and effective receptive field. Our method first predicts a global continuous scale, shared by all spatial positions, for each convolution filter of each network stage. Second, we average the spatial features and distill the scale from the channels to learn the scale effectively. Third, for fast deployment, we propose a scale decomposition method that transfers the robust fractional scale into a combination of fixed integral scales for each convolution filter, exploiting dilated convolution. Moreover, to overcome the shortcomings of our method for large-scale object detection, we modify the Feature Pyramid Network structure. Finally, we show that our method is orthogonal to the choice of sampling strategy. We demonstrate the effectiveness of our method on one-stage and two-stage detectors under different configurations and compare it with different dilated convolution blocks. For practical applications, the training strategy of our method is simple and efficient, avoiding complex data sampling or optimization schemes. During inference, we reduce the latency of the proposed method by using the TensorRT hardware accelerator, without any extra operation. On the COCO test-dev, our ResNet-101 based model achieves 41.7% mAP with a one-stage detector and 42.5% mAP with a two-stage detector, outperforming the baselines by 3.2% and 3.1% mAP, respectively.
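The scale decomposition idea described in the abstract, approximating a learned fractional scale by a combination of fixed integer dilation rates, can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the module name ScaleDecomposedConv, the linear blend between floor(s) and ceil(s) dilations, and the shared-weight assumption are all simplifications made for clarity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleDecomposedConv(nn.Module):
    # Hypothetical sketch: approximate one learned fractional scale `s`
    # (shared by all spatial positions) as a blend of two convolutions
    # with fixed integral dilation rates floor(s) and ceil(s) that share
    # the same weights, so the effective receptive field interpolates
    # between the two fixed scales.
    def __init__(self, in_ch, out_ch, kernel_size=3, init_scale=1.5):
        super().__init__()
        self.kernel_size = kernel_size
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        # One global, continuous scale (a single scalar for this filter bank).
        self.scale = nn.Parameter(torch.tensor(float(init_scale)))

    def forward(self, x):
        s = self.scale.clamp(min=1.0)
        d_lo = int(torch.floor(s).item())        # lower fixed integral dilation
        d_hi = d_lo + 1                          # upper fixed integral dilation
        alpha = (s - d_lo).clamp(0.0, 1.0)       # fractional part = blend weight
        pad_lo = d_lo * (self.kernel_size - 1) // 2   # "same" padding per dilation
        pad_hi = d_hi * (self.kernel_size - 1) // 2
        y_lo = F.conv2d(x, self.weight, padding=pad_lo, dilation=d_lo)
        y_hi = F.conv2d(x, self.weight, padding=pad_hi, dilation=d_hi)
        return (1.0 - alpha) * y_lo + alpha * y_hi

# Usage: a 64x64 feature map keeps its spatial size after the blended convolution.
conv = ScaleDecomposedConv(in_ch=16, out_ch=32)
out = conv(torch.randn(1, 16, 64, 64))
print(out.shape)   # torch.Size([1, 32, 64, 64])

Because each branch uses only fixed integer dilation rates, such a decomposition maps directly onto standard dilated convolutions, which is what allows deployment through accelerators such as TensorRT without custom operations.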
Funding Projects: Major Project for New Generation of AI [2018AAA0100400]; National Natural Science Foundation of China [61836014]; National Natural Science Foundation of China [U21B2042]
WOS Research Area: Computer Science
Language: English
Publisher: SPRINGER
WOS Accession Number: WOS:000759289300002
Funding Organizations: Major Project for New Generation of AI; National Natural Science Foundation of China
Content Type: Journal Article
Source URL: http://ir.ia.ac.cn/handle/173211/47954
Collection: Institute of Automation, Center for Research on Intelligent Perception and Computing
Author Affiliations:
1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
2. Chinese Acad Sci, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
3. Huawei Cloud & AI, Beijing, Peoples R China
4. Univ Chinese Acad Sci, Sch Future Technol, Beijing, Peoples R China
5. Ctr Res Intelligent Percept & Comp, Beijing, Peoples R China
Recommended Citation:
GB/T 7714: Zhang, Zhaoxiang, Pan, Cong, Peng, Junran. Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130(4): 970-989.
APA: Zhang, Zhaoxiang, Pan, Cong, & Peng, Junran. (2022). Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection. INTERNATIONAL JOURNAL OF COMPUTER VISION, 130(4), 970-989.
MLA: Zhang, Zhaoxiang, et al. "Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection". INTERNATIONAL JOURNAL OF COMPUTER VISION 130.4 (2022): 970-989.