Term Space Partition Based Ensemble Feature Construction for Spam Detection
Mi, Guyue ; Gao, Yang ; Tan, Ying
2016
Keywords: Term space partition (TSP); Ensemble term space partition (ETSP); Feature construction; Spam detection; Text categorization
Abstract: This paper proposes an ensemble feature construction method for spam detection based on the term space partition (TSP) approach, which aims to let terms play fuller and more rational roles by dividing the original term space and constructing discriminative features on distinct subspaces. The ensemble features are constructed by taking both global and local features of emails into account, using a variable-length sliding window technique. Experiments conducted on five benchmark corpora suggest that the ensemble feature construction method far outperforms not only the traditional and most widely used bag-of-words model, but also the heuristic and state-of-the-art immune-concentration-based feature construction approaches. Compared to the original TSP approach, the ensemble method achieves better performance and robustness, providing a reliable alternative for different application scenarios.
Indexed by: CPCI-S(ISTP)
Author emails: gymi@pku.edu.cn; gaoyang0115@pku.edu.cn; ytan@pku.edu.cn
Pages: 205-216
Volume: 9714
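Language: English
Source: 1st International Conference on Data Mining and Big Data (DMBD)
DOI: 10.1007/978-3-319-40973-3_20
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/460184
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation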
GB/T 7714
Mi, Guyue,Gao, Yang,Tan, Ying. Term Space Partition Based Ensemble Feature Construction for Spam Detection. 2016-01-01.