Research on Video Condensation and Pedestrian Attribute Classification in Surveillance

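Title	Research on Video Condensation and Pedestrian Attribute Classification in Surveillance (监控场景下的视频浓缩与行人属性分类研究)
Author	Zhu Jianqing (朱建清)
Degree type	Doctor of Engineering
Defense date	2015-05-27
Degree-granting institution	University of Chinese Academy of Sciences
Place of conferral	Institute of Automation, Chinese Academy of Sciences
Advisors	Li Ziqing (李子青); Liao Shengcai (廖胜才)
Keywords	Video Condensation; Joint Video Condensation; Pedestrian Attribute Classification; Attribute Interaction Model; Convolutional Neural Network
Alternative title	Research on Video Condensation and Pedestrian Attribute Classification in Surveillance
Degree discipline	Computer Application Technology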
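Chinese abstract (translated)	With the growing worldwide demand for public security and protection, more and more cameras are being installed in schools, hospitals, streets, residential areas, parks, and similar places. Faced with the massive volume of surveillance video recorded around the clock, how to manage and store it efficiently, and how to extract the desired information automatically and quickly, has become an important and urgent problem. This thesis studies two key problems in intelligent video surveillance: video condensation and pedestrian attribute classification. The main results and contributions are as follows. 1. A high-speed implementation of video condensation is proposed, based on the online video condensation framework. Using multi-threaded programming, the video condensation algorithm is decomposed into three steps executed in parallel by three threads: 1) the foreground object generation thread, which extracts moving object sequences and background images from the video; 2) the foreground object rearrangement thread, which uses an online rearrangement algorithm to rearrange the moving object sequences in the temporal domain, reducing computational complexity; 3) the foreground object stitching thread, which uses the Poisson editing algorithm to stitch foreground objects into background images. To improve speed, a GPU (Graphic Processing Unit) is used to accelerate the object segmentation algorithm based on the Scale Invariant Local Ternary Pattern (SILTP). To save memory, an online background generation method is proposed. To strengthen memory management, a producer-consumer model is introduced between the threads. 2. A joint condensation algorithm for multi-channel video in camera networks is proposed. A loss function for foreground object rearrangement is designed; it incorporates object similarity and temporal constraints between foreground objects across multiple cameras, so that moving objects can be rearranged globally and the resulting multi-channel condensed videos are easier for users to browse. The loss function includes two terms. 1) The occlusion loss term accounts for the occlusion loss between objects under the same camera. To better balance the occlusion loss term against the chronological disorder loss term, a relative occlusion loss based on the degree of occlusion is designed, so that the two terms have comparable value ranges and the occlusion loss is independent of the resolution of the input video. 2) The chronological loss term considers the chronological disorder loss not only between objects under the same camera but also between moving objects under different cameras. To better preserve the chronological order of objects, this term incorporates object similarity, so that greater weight is given to the order between low-similarity moving objects under the same camera and to the order between high-similarity moving objects under different cameras. 3. A pedestrian attribute classification algorithm based on modeling attribute interactions is proposed. The classification score of each attribute consists of two parts: the first is the attribute's own classification score; the second is the output score of a regressor trained on the scores of the other attributes. This prevents the role of inter-attribute relationships from being ignored simply because an attribute is most correlated with itself. To promote research on pedestrian attribute classification, this thesis also builds APiS (Attributed Pedestrians in Surveillance), a pedestrian database with attribute annotations, together with a test protocol and a baseline algorithm. The APiS database contains 3,661 pedestrian images, each annotated with 11 binary attributes. The baseline algorithm predicts each attribute independently, without considering inter-attribute relationships; the classifier for each attribute separately uses the Gentle AdaBoost algorithm to select weak classifiers from color and texture features and combine them into a strong classifier. Experiments show that, on the APiS database, the proposed attribute-interaction-based algorithm achieves better classification results than the baseline algorithm. 4. A pedestrian attribute classification algorithm based on the Multi-Label Convolutional Neural Network (MLCNN) is proposed. MLCNN casts attribute classification as a multi-label classification problem and introduces a multi-label loss function into the convolutional neural...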
English abstract	Driven by the huge demand for public security and protection, more and more surveillance cameras are being installed in places of public activity, such as schools, hospitals, business streets, residential blocks, and parks. This produces massive amounts of surveillance video and presents formidable challenges for video browsing, storage, and retrieval. This thesis focuses on video condensation and pedestrian attribute classification. The main work and contributions of this thesis are summarized as follows. 1. A fast video condensation solution based on the online video condensation framework is proposed. The solution mainly comprises a multi-thread design and an acceleration strategy. The video condensation task is broken down into three parallel steps: 1) the tube (object sequence) generation step, which includes online background generation, moving object segmentation, and sticky tracking; 2) the tube rearrangement step, which uses the online content-aware tube filling algorithm to rearrange the tubes' appearance time labels; 3) the object stitching step, which stitches the rearranged tubes into background images to produce the frames of the condensed video. The three steps run in parallel using multi-threading. Moreover, several techniques are introduced to improve the system's speed and memory consumption: 1) a GPU (Graphic Processing Unit) accelerated background subtraction algorithm based on the scale invariant local ternary pattern (SILTP) feature is used for moving object segmentation; 2) an online background generation method is applied to produce a constantly updated background image at low memory cost; 3) a memory buffer design based on the producer-consumer model is used to balance memory use across the thread modules. 2. A novel multi-channel joint video synopsis (JVS) method is proposed. Traditional video synopsis (TVS) targets videos captured by a single camera and thus cannot present the overall activities of moving objects in a camera network. To solve this issue, JVS uses a global energy function for tube rearrangement. The global energy function includes occlusion cost and chronological disorder cost terms. The occlusion cost term, based on the relative occlusion degree, is designed for tubes extracted from the same camera view; it makes the occlusion cost range comparable to the chronological disorder cost range and independent of the resolution of the input video. ...
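Language	Chinese
Other identifier	201218014629104
Content type	Dissertation
Source URL	[http://ir.ia.ac.cn/handle/173211/6707]
Collection	Graduates_Doctoral dissertations
Recommended citation (GB/T 7714)	Zhu Jianqing. Research on Video Condensation and Pedestrian Attribute Classification in Surveillance [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences. 2015.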
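
The three-thread design with producer-consumer buffers described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the stage bodies are placeholders (no real segmentation, rearrangement cost, or Poisson editing), and every name here (`generate_tubes`, `rearrange_tubes`, `stitch_tubes`, `run_pipeline`, `buffer_size`) is hypothetical. The bounded queues stand in for the memory buffers between threads: a producer blocks when its buffer is full, which caps how much data is held in memory at once.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker passed down the pipeline


def generate_tubes(frames, out_q):
    """Stage 1 (placeholder): emit one 'tube' per input item."""
    for f in frames:
        out_q.put({"tube_id": f, "start": f})  # blocks if the buffer is full
    out_q.put(SENTINEL)


def rearrange_tubes(in_q, out_q):
    """Stage 2 (placeholder): assign each tube the next free time slot."""
    next_slot = 0
    while True:
        tube = in_q.get()
        if tube is SENTINEL:
            out_q.put(SENTINEL)  # forward end-of-stream to stage 3
            return
        tube["new_start"] = next_slot
        next_slot += 1
        out_q.put(tube)


def stitch_tubes(in_q, results):
    """Stage 3 (placeholder): record where each tube would be stitched."""
    while True:
        tube = in_q.get()
        if tube is SENTINEL:
            return
        results.append((tube["tube_id"], tube["new_start"]))


def run_pipeline(frames, buffer_size=4):
    """Run the three stages in parallel threads joined by bounded queues."""
    q12 = queue.Queue(maxsize=buffer_size)  # stage 1 -> stage 2
    q23 = queue.Queue(maxsize=buffer_size)  # stage 2 -> stage 3
    results = []
    threads = [
        threading.Thread(target=generate_tubes, args=(frames, q12)),
        threading.Thread(target=rearrange_tubes, args=(q12, q23)),
        threading.Thread(target=stitch_tubes, args=(q23, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline([10, 20, 30])` passes three placeholder tubes through all stages in order. A smaller `buffer_size` lowers peak memory at the cost of more blocking between stages; how the thesis sizes its buffers is not stated in the abstract.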
