A Key Volume Mining Deep Framework for Action Recognition | |
Wangjiang Zhu; Jie Hu; Gang Sun; Xudong Cao; Yu Qiao | |
2016 | |
Conference name | CVPR2016 |
Conference location | United States |
Abstract | Recently, deep learning approaches have demonstrated remarkable progress in action recognition in videos. Most existing deep frameworks treat every volume (i.e., spatial-temporal video clip) equally and directly assign the video label to all volumes sampled from it. However, within a video, discriminative actions may occur sparsely in a few key volumes, and most other volumes are irrelevant to the labeled action category. Training with a large proportion of irrelevant volumes will hurt performance. To address this issue, we propose a key volume mining deep framework that identifies key volumes and conducts classification simultaneously. Specifically, our framework is optimized in an alternating way that is integrated into the forward and backward stages of Stochastic Gradient Descent (SGD). In the forward pass, our network mines key volumes for each action class. In the backward pass, it updates network parameters with the help of these mined key volumes. In addition, we propose “Stochastic out” to model key volumes from multiple modalities, and a simple yet effective “unsupervised key volume proposal” method for high-quality volume sampling. Our experiments show that action recognition performance can be significantly improved by mining key volumes, and we achieve state-of-the-art performance on HMDB51 and UCF101 (93.1%). |
Indexed by | EI |
Language | English |
Content type | Conference paper |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/10023] |
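Collection | Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology (集成所) |
Author affiliation | 2016 |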
Recommended citation (GB/T 7714) | Wangjiang Zhu, Jie Hu, Gang Sun, et al. A Key Volume Mining Deep Framework for Action Recognition[C]. In: CVPR2016. United States. |
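
The forward/backward alternation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes, purely for illustration, that the key volume for each class is the sampled volume with the highest score for that class, and that the loss is a softmax cross-entropy over those per-class maxima. The function names (`mine_key_volumes`, `grad_wrt_volume_scores`) and the toy scores are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mine_key_volumes(scores):
    """Forward pass (assumed max-selection rule).

    scores: array of shape (K, C), class scores for K volumes of one video.
    Returns the per-class pooled score and the index of the volume chosen
    as the key volume for each class.
    """
    num_classes = scores.shape[1]
    key_idx = scores.argmax(axis=0)                    # shape (C,)
    pooled = scores[key_idx, np.arange(num_classes)]   # shape (C,)
    return pooled, key_idx

def grad_wrt_volume_scores(scores, key_idx, label):
    """Backward pass: gradient of the cross-entropy loss w.r.t. every
    volume score. Only the mined key volumes receive a nonzero gradient."""
    num_classes = scores.shape[1]
    pooled = scores[key_idx, np.arange(num_classes)]
    grad_pooled = softmax(pooled)
    grad_pooled[label] -= 1.0                          # d(loss)/d(pooled)
    grad = np.zeros_like(scores)
    grad[key_idx, np.arange(num_classes)] = grad_pooled
    return grad

# Toy example: 3 volumes sampled from one video, 3 action classes.
scores = np.array([[0.1, 2.0, -1.0],
                   [1.5, 0.3,  0.2],
                   [0.0, 0.1,  0.9]])
pooled, key_idx = mine_key_volumes(scores)
grad = grad_wrt_volume_scores(scores, key_idx, label=1)
print(key_idx)   # index of the key volume selected for each class
print(grad)      # nonzero only at the (key volume, class) positions
```

In this sketch, volumes that are never selected as a key volume contribute nothing to the parameter update, which mirrors the abstract's point that irrelevant volumes should not drive training.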