Group-Scheme: SIMD-based Compression Algorithms for Web Text Data
Zhang, Xudong ; Zhao, Wayne Xin ; Shan, Dongdong ; Yan, Hongfei
2013
Keywords: SIMD; inverted index; index compression; integer encoding
Abstract: Compression algorithms are important for data-oriented tasks, especially in the era of Big Data. Modern processors provide powerful SIMD instruction sets, which offer an opportunity for better performance. Although SIMD-based optimization of compression has been explored in some studies [2, 7], these studies usually focus on modifying existing algorithms to fit SIMD instructions. In this paper, we propose a compression framework with a novel storage layout format, which aims to improve the instruction-level parallelizability of compression algorithms. By instantiating the framework, we design a novel family of compression algorithms, called Group-Scheme, and present a parallelized version, called SIMD-Group-Scheme. We evaluate the proposed algorithms on two public TREC data sets. While remaining very competitive in compression ratio and encoding speed, SIMD-Group-Scheme significantly outperforms both the implementation without SIMD instructions and the state-of-the-art algorithm (i.e., SIMD-G8IU [7]) with respect to decoding speed.
Subject categories: Computer Science, Information Systems; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
Indexed in: EI; CPCI-S(ISTP)
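Language: English
DOI: 10.1109/BigData.2013.6691617
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/292540
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院), Peking University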
Recommended citation
GB/T 7714
Zhang, Xudong,Zhao, Wayne Xin,Shan, Dongdong,et al. Group-Scheme: SIMD-based Compression Algorithms for Web Text Data. 2013-01-01.