Research on Image Clustering and Annotation Methods Based on Non-negative Matrix Factorization

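Title	基于非负矩阵分解的图像聚类和标注方法研究 (Research on Image Clustering and Annotation Methods Based on Non-negative Matrix Factorization)
Author	Li Bingfeng (李冰锋)
Degree type	Doctoral
Defense date	2016-09-29
Degree-granting institution	Shenyang Institute of Automation, Chinese Academy of Sciences (中国科学院沈阳自动化研究所)
Advisor	Tang Yandong (唐延东)
Keywords	non-negative matrix factorization; geometric structure preservation; graph regularization; multi-view learning; image annotation
Alternative title	Image Clustering and Annotation based on Non-negative Matrix Factorization
Degree discipline	Pattern Recognition and Intelligent Systems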
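Chinese abstract (translated)	Non-negative matrix factorization (NMF) decomposes a non-negative matrix into the product of two non-negative factor matrices. As a local, sparse data representation method, NMF has been widely applied in information processing and remains an important research topic in machine learning and computer vision. This dissertation takes NMF as its subject, studies graph and hypergraph regularization methods for NMF in depth, and partially resolves the shortcomings of existing algorithms on sample data with non-linear distributions. The main research work is as follows.
(1) A geometric-structure-preserving NMF algorithm is proposed. It describes, by means of graph regularization, the similarity relations between neighboring sample points and the repulsion relations between distant sample points in the original sample distribution, and introduces them into the NMF framework, which improves the algorithm's ability to preserve manifold structure. Compared with standard NMF, the proposed algorithm makes full use of prior knowledge about the sample distribution and can therefore obtain a better low-dimensional data representation. To make the algorithm more robust to noise points and outliers, the error term of the factorization is computed with the L2,1 norm. Compared with the L2 norm, L2,1-norm data reconstruction reduces the share of the total error contributed by outliers or noise points, and so greatly weakens their influence on the factorization result. Experimental results on several test sets support this idea, and comparisons with other methods show that the robust geometric-structure-preserving NMF algorithm is strongly robust to noisy samples.
(2) Regularization based on graph models can effectively describe pairwise relations between samples, but it is comparatively weak at describing multi-way or more complex relations among them. For this reason, multiple-hypergraph regularization is introduced into the NMF framework. Compared with graph-based regularization, hypergraph regularization can effectively describe complex multi-way relations and higher-order information among samples. In addition, to avoid the over-fitting that arises when a single hypergraph is used to characterize the manifold structure of the sample distribution, a convex combination of hypergraphs from a pre-defined hypergraph set is used to approximate the intrinsic manifold structure. Multiple-hypergraph regularized NMF greatly improves the completeness and accuracy of the description of data relations and strengthens clustering performance.
(3) An image annotation algorithm based on hypergraph transductive NMF is proposed. The algorithm first uses a supervised hypergraph model to describe the relation between the low-level visual features of samples and their high-level semantic labels, and predicts the semantic labels through hypergraph regularization. To control the prediction error of the semantic labels, a transductive regularization term on the labels is also introduced into the NMF framework, so that label prediction and error control are achieved together. Two improvements are proposed to raise annotation accuracy. First, to make effective use of the redundant information among heterogeneous sample features, the single-view annotation algorithm is extended to the multi-view setting, and a consistency constraint is imposed on the coefficient matrices obtained on the individual views so that their physical meanings remain related. Second, to avoid the burden of re-learning the prediction model whenever new training samples or annotation information are added to the sample database, a query-sample-driven annotation method is proposed, in which the labels of a query sample are predicted using only its k nearest training samples; this greatly simplifies model learning. Experimental results on image annotation databases show that, compared with other algorithms, the proposed annotation algorithm has good annotation performance and is robust to its parameters.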
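The abstract's claim that the L2,1 norm reduces the share of the total error contributed by outliers can be illustrated with a small numerical sketch. The residual matrix below is hypothetical and assumes the common convention that each column is one sample; the dissertation's exact formulation is not given in this record.

```python
import numpy as np

# Hypothetical residual matrix E = X - U V^T; each column is one sample's residual.
E = np.zeros((2, 10))
E[0, :9] = 1.0    # nine ordinary samples, residual norm 1 each
E[0, 9] = 10.0    # one outlier, residual norm 10

col_norms = np.linalg.norm(E, axis=0)    # per-sample Euclidean residual norms

frobenius_sq = np.sum(col_norms ** 2)    # squared Frobenius norm: sum of squared column norms
l21 = np.sum(col_norms)                  # L2,1 norm: sum of unsquared column norms

share_fro = col_norms[9] ** 2 / frobenius_sq    # outlier's share of the squared error: 100 / 109
share_l21 = col_norms[9] / l21                  # outlier's share of the L2,1 error: 10 / 19
```

In this toy case the outlier accounts for about 92% of the squared Frobenius error but only about 53% of the L2,1 error, because its residual enters without being squared.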
English abstract	Non-negative matrix factorization (NMF) approximates a non-negative matrix by the product of two non-negative factor matrices. As a local, sparse data representation method, NMF has been widely used in information processing, and it remains an important research topic in machine learning and computer vision. This dissertation focuses on graph- and hypergraph-based regularization of NMF, with the aim of addressing the inability of current algorithms to describe non-linearly distributed data samples. The main contributions are as follows.
(1) A geometric-structure-preserving NMF algorithm is proposed. It uses graph regularization to characterize both the similarity between neighboring samples and the repulsion between distant samples. By incorporating these local and distant structure-preservation regularization terms into the standard NMF framework, the algorithm finds a low-dimensional embedding subspace that preserves this structure. To handle noisy data effectively, the data reconstruction term of NMF is measured with the L2,1-norm instead of the traditional L2-norm. Compared with the L2-norm, the L2,1-norm reconstruction term reduces the effect of noise-corrupted samples and outliers on the factorization. Clustering experiments on several facial image datasets show a significant performance improvement for the robust structure-preserving NMF method in dimensionality reduction. Comparisons with state-of-the-art methods show the effectiveness and robustness of the method on noise-corrupted samples.
(2) Graph regularization can effectively depict pairwise relationships between data samples, but it is comparatively weak at describing multi-way or more complex relationships among them. To this end, a multiple-hypergraph regularization term is introduced into the NMF framework. Compared with graph regularization, hypergraph regularization efficiently characterizes complex multi-way relationships and higher-order information. To avoid the over-fitting that can result from characterizing the manifold with a single hypergraph, a convex combination of hypergraphs from a pre-defined set is used to approximate the manifold structure of the sample distribution. Multiple-hypergraph regularized NMF greatly improves the completeness and accuracy of the data description and improves clustering performance.
(3) An automatic image annotation algorithm based on hypergraph transductive NMF is proposed. The algorithm first represents the mapping between low-level visual features and high-level semantic labels with a supervised hypergraph model, and then predicts the semantic labels of samples through hypergraph regularization. To control the label prediction error, a transductive regularization term on the semantic labels is also introduced into the NMF framework. Two improvements increase annotation accuracy. First, the hypergraph transductive NMF is extended from a single view to multiple views; constraining the coefficient matrices of the different views to be consistent makes effective use of the redundant information among heterogeneous sample features. Second, to avoid re-learning the annotation model whenever a new image or new annotation information is added to the dataset, a query-image-driven model is proposed that predicts the labels of a query image using only its k-nearest-neighbor training samples. This greatly simplifies model learning.
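As background for the graph-regularized factorization the abstract builds on, the following is a minimal sketch of a well-known baseline: graph-regularized NMF with multiplicative updates. It is not the dissertation's algorithm: it has no repulsion term, no L2,1 loss, and no hypergraphs. The function names, the 0/1 k-nearest-neighbor weighting, and the parameters `lam` and `k` are illustrative assumptions.

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetric 0/1 k-nearest-neighbor affinity matrix; columns of X are samples."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                       # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]                # k nearest neighbors per sample
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                          # symmetrize

def objective(X, U, V, W, lam):
    """||X - U V^T||_F^2 + lam * tr(V^T L V), with graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T, "fro") ** 2 + lam * np.trace(V.T @ L @ V)

def gnmf(X, rank, lam=1.0, k=5, n_iter=200, eps=1e-10, seed=0):
    """Factorize non-negative X (features x samples) as U V^T with a graph penalty on V."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank))
    V = rng.random((n, rank))
    W = knn_graph(X, k)
    D = np.diag(W.sum(axis=1))
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative; eps avoids division by zero.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V
```

Each row of `V` is the low-dimensional representation of one sample; for clustering, a common follow-up is to run k-means on the rows of `V`.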
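Language	Chinese
Ownership rank	1
Pages	120
Content type	Dissertation
Source URL	[http://ir.sia.cn/handle/173321/19460]
Collection	Shenyang Institute of Automation_Robotics Laboratory
Recommended citation (GB/T 7714)	Li Bingfeng (李冰锋). Research on Image Clustering and Annotation Methods Based on Non-negative Matrix Factorization [D]. Shenyang Institute of Automation, Chinese Academy of Sciences. 2016.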