Accelerating sequence searching: dimensionality reduction method
Song, Guojie ; Cui, Bin ; Zheng, Baihua ; Xie, Kunqing ; Yang, Dongqing
Journal: Knowledge and Information Systems
2009
Keywords: Sequence similarity search; Sequence embedding; Index; Dimension reduction; Algorithm
DOI: 10.1007/s10115-008-0180-0
Abstract: Similarity search over long sequence datasets is becoming increasingly popular in many emerging applications, such as text retrieval and genetic sequence exploration. In this paper, a novel index structure, the Sequence Embedding Multiset tree (SEM-tree), is proposed to speed up searching over long sequences. The SEM-tree is a multi-level structure in which each level represents the sequence data as multisets at a different compression level; the multiset length increases towards the leaf level, which contains the original sequences. The multisets, obtained using sequence embedding algorithms, have the desirable property that they need not keep the character order of the sequence, which yields a shorter representation, yet they preserve most of the distance information between sequences. Each level of the tree serves to prune the search space more efficiently, as the multisets use this predictability to terminate the search early and greatly reduce the computational cost. A comprehensive set of experiments is conducted to evaluate the performance of the SEM-tree, and the experimental results show that the proposed method is much more efficient than existing representative methods.
Computer Science, Artificial Intelligence; Computer Science, Information Systems; SCI(E); 6; ARTICLE; 3; 301-322; 20
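The pruning idea in the abstract can be illustrated with a much simpler, single-level filter. The sketch below is not the SEM-tree and does not use the paper's embedding algorithm, which the abstract does not specify. It assumes a plain character-count multiset as the order-free representation and uses the well-known frequency-distance lower bound on edit distance to skip candidates before running the expensive comparison. All function names are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def multiset_lower_bound(qa: Counter, qb: Counter) -> int:
    """Lower bound on edit distance computed from character multisets only.

    A single edit operation changes the surplus and the deficit by at most
    one each, so max(surplus, deficit) can never exceed the edit distance.
    """
    surplus = sum((qa - qb).values())  # characters qa has in excess
    deficit = sum((qb - qa).values())  # characters qb has in excess
    return max(surplus, deficit)


def range_search(query: str, sequences: list[str], radius: int):
    """Return (hits, verified): sequences within `radius` of `query`, and
    how many candidates needed the full edit-distance computation."""
    q_counts = Counter(query)
    hits, verified = [], 0
    for s in sequences:
        # Cheap order-free filter: prune when the bound already exceeds radius.
        if multiset_lower_bound(q_counts, Counter(s)) > radius:
            continue
        verified += 1
        if edit_distance(query, s) <= radius:
            hits.append(s)
    return hits, verified
```

Calling `range_search("ACGTACGT", ["ACGTACGT", "TTTTTTTT", "TGCATGCA"], 1)` prunes `"TTTTTTTT"` from the multisets alone, while `"TGCATGCA"` has the same character counts as the query and therefore still has to be verified. Because the filter is a true lower bound, it discards candidates without losing any correct results; a multi-level structure such as the one the abstract describes would apply progressively finer representations of this kind.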
Language: English
Content type: Journal article
Source URL: [http://ir.pku.edu.cn/handle/20.500.11897/152877]
Collection: School of Information Science and Technology (信息科学技术学院)
Recommended citation
GB/T 7714: Song, Guojie, Cui, Bin, Zheng, Baihua, et al. Accelerating sequence searching: dimensionality reduction method[J]. Knowledge and Information Systems, 2009.
APA: Song, Guojie, Cui, Bin, Zheng, Baihua, Xie, Kunqing, & Yang, Dongqing. (2009). Accelerating sequence searching: dimensionality reduction method. Knowledge and Information Systems.
MLA: Song, Guojie, et al. "Accelerating sequence searching: dimensionality reduction method". Knowledge and Information Systems (2009).