A two-stage temporal proposal network for precise action localization in untrimmed video
Wang F(王斐)3; Wang, Guorui2; Du, Yuxuan2; He, Zhenquan2; Jiang Y(姜勇)1
Journal: International Journal of Machine Learning and Cybernetics
Year: 2021
Volume: 12; Issue: 8; Pages: 2199-2211
Keywords: Action detection; Correctness discriminator; Extended context pooling; Temporal context regression
ISSN: 1868-8071
Institution ranking: 3
Abstract

In this paper, we propose a two-stage temporal proposal algorithm for action detection in long untrimmed videos. In the first stage, we propose a novel prior-minor watershed algorithm for action proposals, which combines a precise prior watershed proposal algorithm with a minor supplementary sliding-window algorithm. Here, we propose a correctness discriminator that uses the sliding-window proposals to fill in proposals that the watershed proposal algorithm may omit. In the second stage, we first propose extended context pooling (ECP) with two modules (internal and context). The context information module of ECP structures the proposals and enhances the extended features of action proposals. Different levels of ECP are introduced to model the action proposal region and to make its extended context region more targeted and precise. Then, we propose a temporal context regression network, which adopts a multi-task loss to train temporal coordinate regression and action/background classification simultaneously, and outputs precise temporal boundaries for the proposals. Here, we also propose prior-minor ranking to balance the effect of the prior watershed proposals and the minor supplementary proposals. On three large-scale benchmarks, THUMOS14, ActivityNet (v1.2 and v1.3), and Charades, our approach achieves superior performance compared with other state-of-the-art methods and runs at over 1020 frames per second (fps) on a single NVIDIA Titan-X Pascal GPU, indicating that our method can efficiently improve the precision of the action localization task.
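
The first-stage idea of supplementing prior proposals with sliding-window proposals can be illustrated with a minimal sketch. The abstract does not say how the correctness discriminator decides which windows to add, so the temporal-IoU threshold below is only a hypothetical stand-in, and the names `temporal_iou`, `sliding_windows`, `supplement`, and `max_iou` are assumptions made for this illustration rather than the authors' implementation.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1-D temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(duration: float, length: float, stride: float) -> List[Segment]:
    """Fixed-length windows that fit entirely inside [0, duration]."""
    windows = []
    start = 0.0
    while start + length <= duration:
        windows.append((start, start + length))
        start += stride
    return windows


def supplement(prior: List[Segment], minor: List[Segment],
               max_iou: float = 0.5) -> List[Segment]:
    """Keep every prior proposal; add a minor (sliding-window) proposal
    only when it overlaps no prior proposal by max_iou or more.

    Assumption: a plain IoU threshold stands in for the learned or
    rule-based correctness discriminator, which the abstract does not specify.
    """
    kept = list(prior)
    for window in minor:
        if all(temporal_iou(window, p) < max_iou for p in prior):
            kept.append(window)
    return kept


# Usage: one prior proposal, three candidate windows over a 12-second clip.
prior_proposals = [(2.0, 6.0)]
candidates = sliding_windows(duration=12.0, length=4.0, stride=4.0)
merged = supplement(prior_proposals, candidates, max_iou=0.3)
```

With these inputs, the windows (0, 4) and (4, 8) each overlap the prior proposal with a temporal IoU of 1/3 and are dropped, while (8, 12) covers a region the prior proposal misses and is kept.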

Funding projects: Foundation of National Natural Science Foundation of China [61973065]; Fundamental Research Funds for the Central Universities of China [N172608005]; Fundamental Research Funds for the Central Universities of China [N182612002]; Fundamental Research Funds for the Central Universities of China [N2026002]; National Natural Science Foundation of China [61973065]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:000635039400001
Funding agencies: Foundation of National Natural Science Foundation of China under Grant 61973065; Fundamental Research Funds for the Central Universities of China under Grant N172608005, N182612002 and N2026002; National Natural Science Foundation of China under Grant 61973065
Content type: Journal article
Source URL: [http://ir.sia.cn/handle/173321/28738]
Collection: Process Equipment and Intelligent Robotics Laboratory (工艺装备与智能机器人研究室)
Corresponding author: Wang F (王斐)
Author affiliations:
1. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
3. Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110169, China
Recommended citation
GB/T 7714
Wang F, Wang, Guorui, Du, Yuxuan, et al. A two-stage temporal proposal network for precise action localization in untrimmed video[J]. International Journal of Machine Learning and Cybernetics, 2021, 12(8): 2199-2211.
APA Wang F, Wang, Guorui, Du, Yuxuan, He, Zhenquan, & Jiang Y. (2021). A two-stage temporal proposal network for precise action localization in untrimmed video. International Journal of Machine Learning and Cybernetics, 12(8), 2199-2211.
MLA Wang F, et al. "A two-stage temporal proposal network for precise action localization in untrimmed video." International Journal of Machine Learning and Cybernetics 12.8 (2021): 2199-2211.