A Key Volume Mining Deep Framework for Action Recognition
Wangjiang Zhu; Jie Hu; Gang Sun; Xudong Cao; Yu Qiao
2016
Conference name: CVPR2016
Conference location: United States
Abstract: Recently, deep learning approaches have demonstrated remarkable progress for action recognition in videos. Most existing deep frameworks treat every volume (i.e., spatial-temporal video clip) equally and directly assign the video label to all volumes sampled from it. However, within a video, discriminative actions may occur sparsely in a few key volumes, and most other volumes are irrelevant to the labeled action category. Training with a large proportion of irrelevant volumes will hurt performance. To address this issue, we propose a key volume mining deep framework that identifies key volumes and conducts classification simultaneously. Specifically, our framework is optimized in an alternating way that is integrated into the forward and backward stages of Stochastic Gradient Descent (SGD). In the forward pass, our network mines key volumes for each action class. In the backward pass, it updates network parameters with the help of these mined key volumes. In addition, we propose “Stochastic out” to model key volumes from multiple modalities, and a simple yet effective “unsupervised key volume proposal” method for high-quality volume sampling. Our experiments show that action recognition performance can be significantly improved by mining key volumes, and we achieve state-of-the-art performance on HMDB51 and UCF101 (93.1%).
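The alternating scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a "key volume" is simply one of the k volumes with the highest score for the labeled class, and that the backward pass is restricted to those volumes through a 0/1 mask. The function names `mine_key_volumes` and `gradient_mask`, the parameter `k`, and the toy scores are all hypothetical.

```python
from typing import List


def mine_key_volumes(scores: List[List[float]], label: int, k: int = 1) -> List[int]:
    """Forward-pass step (assumed rule): pick the k volumes that score
    highest for the labeled class and return their indices in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i][label], reverse=True)
    return sorted(ranked[:k])


def gradient_mask(num_volumes: int, key_indices: List[int]) -> List[float]:
    """Backward-pass step (assumed rule): only mined key volumes
    contribute to the parameter update; all other volumes are masked out."""
    keys = set(key_indices)
    return [1.0 if i in keys else 0.0 for i in range(num_volumes)]


# Toy scores for one video: 4 sampled volumes, 3 action classes (hypothetical data).
scores = [
    [0.5, 0.3, 0.2],
    [0.05, 0.05, 0.9],
    [0.6, 0.3, 0.1],
    [0.3, 0.1, 0.6],
]
key_indices = mine_key_volumes(scores, label=2, k=2)
mask = gradient_mask(len(scores), key_indices)
print(key_indices, mask)
```

In a real training loop the mask would multiply the per-volume loss before backpropagation, so volumes judged irrelevant to the labeled class do not influence the update in that iteration.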
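Indexed by: EI
Language: English
Content type: Conference paper
Source URL: [http://ir.siat.ac.cn:8080/handle/172644/10023]
Collection: Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology
Author affiliation: 2016
Recommended citation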
GB/T 7714
Wangjiang Zhu, Jie Hu, Gang Sun, et al. A Key Volume Mining Deep Framework for Action Recognition[C]. In: CVPR2016. United States.