Hierarchical taxonomy preparation for text categorization using consistent bipartite spectral graph copartitioning
Gao, B ; Liu, TY ; Feng, G ; Qin, T ; Cheng, QS ; Ma, WY
2005
Keywords: clustering; data mining; singular value decomposition; text processing
Abstract: Multiclass classification has been investigated for many years in the literature. Recently, the scale of real-world multiclass classification applications has grown steadily. For example, hundreds of thousands of categories are employed in the Open Directory Project (ODP) and the Yahoo! directory. In such cases, the scalability of classification methods becomes a major concern. To tackle this problem, hierarchical classification has been proposed and widely adopted to achieve a better trade-off between effectiveness and efficiency. Unfortunately, many data sets are not explicitly organized in hierarchical form and, therefore, hierarchical classification cannot be used directly. In this paper, we propose a novel algorithm that automatically mines a hierarchical structure from the flat taxonomy of a data corpus, as preparation for the adoption of hierarchical classification. In particular, we first compute matrices that represent the relations among categories, documents, and terms. We then cocluster these three types of objects at different scales through consistent bipartite spectral graph copartitioning, which is formulated as a generalized singular value decomposition problem. Finally, a hierarchical taxonomy is constructed from the category clusters. Our experiments showed that the proposed algorithm could discover a reasonable taxonomy hierarchy and help improve classification accuracy.
Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic; SCI(E); CPCI-S(ISTP); 11
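The coclustering step described in the abstract can be illustrated with a much simpler relative. The sketch below is not the paper's consistent three-way (category, document, term) copartitioning and does not use a generalized SVD; it is ordinary two-way bipartite spectral partitioning on a single category-by-term matrix, computed with a plain SVD of the degree-normalized matrix. The function name `bipartite_spectral_bipartition` and the toy matrix `toy` are hypothetical and exist only for this illustration.

```python
import numpy as np


def bipartite_spectral_bipartition(A):
    """Split the rows and columns of a nonnegative matrix into two co-clusters.

    Illustrative stand-in only: standard two-way bipartite spectral
    partitioning, not the consistent tripartite method of the paper.
    """
    A = np.asarray(A, dtype=float)
    row_deg = A.sum(axis=1)
    col_deg = A.sum(axis=0)
    # Inverse square roots of the degrees; empty rows/columns get weight 0.
    d1 = np.zeros_like(row_deg)
    d2 = np.zeros_like(col_deg)
    d1[row_deg > 0] = 1.0 / np.sqrt(row_deg[row_deg > 0])
    d2[col_deg > 0] = 1.0 / np.sqrt(col_deg[col_deg > 0])
    # Degree-normalized matrix D1^{-1/2} A D2^{-1/2}.
    An = d1[:, None] * A * d2[None, :]
    U, s, Vt = np.linalg.svd(An, full_matrices=False)
    # The second singular vector pair carries the two-way split.
    u2 = d1 * U[:, 1]
    v2 = d2 * Vt[1, :]
    # Threshold at zero; which side is labeled 1 depends on the SVD sign.
    row_labels = (u2 > 0).astype(int)
    col_labels = (v2 > 0).astype(int)
    return row_labels, col_labels


# Hypothetical toy data: 4 categories (rows) by 6 terms (columns).
# Categories 0-1 mostly use terms 0-2; categories 2-3 mostly use terms 3-5.
toy = np.array([
    [5, 4, 3, 0, 0, 1],
    [4, 5, 4, 1, 0, 0],
    [0, 1, 0, 5, 4, 4],
    [1, 0, 0, 4, 5, 3],
])

row_labels, col_labels = bipartite_spectral_bipartition(toy)
print("category groups:", row_labels)
print("term groups:", col_labels)
```

Applying such a split repeatedly to each resulting group of categories would give one plausible way to grow a tree from a flat category list, though the abstract does not state that the hierarchy is built this way.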
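Language: English
Source: SCI
Content type: Other
Source URL: http://hdl.handle.net/20.500.11897/314984
Collection: School of Mathematical Sciences (Peking University)
Recommended citation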
GB/T 7714
Gao, B,Liu, TY,Feng, G,et al. Hierarchical taxonomy preparation for text categorization using consistent bipartite spectral graph copartitioning. 2005-01-01.