Fast Parallel Path Concatenation for Graph Extraction | |
Shao, Yingxia ; Lei, Kai ; Chen, Lei ; Huang, Zi ; Cui, Bin ; Liu, Zhongyi ; Tong, Yunhai ; Xu, Jin | |
刊名 | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING |
2017 | |
关键词 | Heterogenegous graph graph extraction path concatenation parallelisim SIMRANK CUBE |
DOI | 10.1109/TKDE.2017.2716939 |
英文摘要 | Heterogeneous graph is a popular data model to represent the real-world relations with abundant semantics. To analyze heterogeneous graphs, an important step is extracting homogeneous graphs from the heterogeneous graphs, called homogeneous graph extraction. In an extracted homogeneous graph, the relation is defined by a line pattern on the heterogeneous graph and the new attribute values of the relation are calculated by user-defined aggregate functions. The key challenges of the extraction problem are how to efficiently enumerate paths matched by the line pattern and aggregate values for each pair of vertices from the matched paths. To address above two challenges, we propose a parallel graph extraction framework, where we use vertex-centric model to enumerate paths and compute aggregate functions in parallel. The framework compiles the line pattern into a path concatenation plan, which determines the order of concatenating paths and generates the final paths in a divide-and-conquer manner. We introduce a cost model to estimate the cost of a plan and discuss three plan selection strategies, among which the best plan can enumerate paths in O(log(l)) iterations, where l is the length of a pattern. Furthermore, to improve the performance of evaluating aggregate functions, we classify the aggregate functions into three categories, i.e., distributive aggregation, algebraic aggregation, and holistic aggregation. Since the distributive and algebraic aggregations can be computed from the partial paths, we speed up the aggregation by computing partial aggregate values during the path enumeration.; 973 program [2014CB340405]; China Postdoctoral Science Foundation [2017M610020]; National Natural Science Foundation of China [61572039, 61472009, U1536201]; Shenzhen Gov Research Project [CYJ20151014093505032]; Alibaba-PKU joint Program; Australian Research Council [FT130101530]; Hong Kong RGC Project [16202215]; 973 Program of China [2014CB340303]; NSFC [61502021, 61328202, 61300031, 61532004, U1301253]; Microsoft Research Asia Collaborative Grant; SCI(E); ARTICLE; 10; 2210-2222; 29 |
语种 | 英语 |
内容类型 | 期刊论文 |
源URL | [http://ir.pku.edu.cn/handle/20.500.11897/470652] |
专题 | 信息科学技术学院 |
推荐引用方式 GB/T 7714 | Shao, Yingxia,Lei, Kai,Chen, Lei,et al. Fast Parallel Path Concatenation for Graph Extraction[J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,2017. |
APA | Shao, Yingxia.,Lei, Kai.,Chen, Lei.,Huang, Zi.,Cui, Bin.,...&Xu, Jin.(2017).Fast Parallel Path Concatenation for Graph Extraction.IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. |
MLA | Shao, Yingxia,et al."Fast Parallel Path Concatenation for Graph Extraction".IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2017). |
个性服务 |
查看访问统计 |
相关权益政策 |
暂无数据 |
收藏/分享 |
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。
修改评论