Institute of Computing Technology, Chinese Academy IR
Load-balancing distributed outer joins through operator decomposition | |
Cheng, Long1; Kotoulas, Spyros2; Liu, Qingzhi3; Wang, Ying4 | |
2019-10-01 | |
Journal | JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING |
ISSN | 0743-7315 |
Volume | 132; Pages: 21-35 |
Abstract | High-performance data analytics largely relies on being able to efficiently execute various distributed data operators such as distributed joins. So far, large amounts of join methods have been proposed and evaluated in parallel and distributed environments. However, most of them focus on inner joins, and there is little published work providing the detailed implementations and analysis of outer joins. In this work, we present POPI (Partial Outer join & Partial Inner join), a novel method to load-balance large parallel outer joins by decomposing them into two operations: a large outer join over data that does not present significant skew in the input and an inner join over data presenting significant skew. We present the detailed implementation of our approach and show that POPI is implementable over a variety of architectures and underlying join implementations. Moreover, our experimental evaluation over a distributed memory platform also demonstrates that the proposed method is able to improve outer join performance under varying data skew and present excellent load-balancing properties, compared to current approaches. (C) 2019 Elsevier Inc. All rights reserved. |
Keywords | Distributed join; Outer join; Data skew; Load balancing; Spark |
DOI | 10.1016/j.jpdc.2019.05.008 |
Indexed By | SCI |
Language | English |
Funding Project | European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant[799066] |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Theory & Methods |
WOS Accession Number | WOS:000476580400003 |
Publisher | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/4460 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Cheng, Long |
Author Affiliations | 1. Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland 2. IBM Res, Dublin, Ireland 3. Eindhoven Univ Technol, Dept Math & Comp Sci, Eindhoven, Netherlands 4. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China |
Recommended Citation (GB/T 7714) | Cheng, Long, Kotoulas, Spyros, Liu, Qingzhi, et al. Load-balancing distributed outer joins through operator decomposition[J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 132: 21-35. |
APA | Cheng, Long, Kotoulas, Spyros, Liu, Qingzhi, & Wang, Ying. (2019). Load-balancing distributed outer joins through operator decomposition. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 132, 21-35. |
MLA | Cheng, Long, et al. "Load-balancing distributed outer joins through operator decomposition". JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 132 (2019): 21-35. |
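
The decomposition described in the abstract (an outer join over the non-skewed data plus an inner join over the skewed data) can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the function names, the `threshold` parameter, and the rule that a key counts as skewed only when it is frequent in the left input and has at least one match in the right input are all assumptions made here so that the inner-join part cannot drop unmatched rows. A real distributed version would presumably handle the skewed part with a different data-movement strategy from the hash-partitioned remainder.

```python
from collections import Counter, defaultdict


def _index(rows):
    """Build key -> list of values for a list of (key, value) rows."""
    idx = defaultdict(list)
    for k, v in rows:
        idx[k].append(v)
    return idx


def left_outer_join(r, s):
    """Plain left outer join; unmatched left rows are padded with None."""
    idx = _index(s)
    out = []
    for k, v in r:
        matches = idx.get(k)
        if matches:
            out.extend((k, v, w) for w in matches)
        else:
            out.append((k, v, None))
    return out


def inner_join(r, s):
    """Plain inner join on the key."""
    idx = _index(s)
    return [(k, v, w) for k, v in r for w in idx.get(k, ())]


def decomposed_left_outer_join(r, s, threshold):
    """Left outer join computed as (outer join on rest) + (inner join on skewed).

    Assumption (hypothetical rule, not taken from the paper): a key is
    'skewed' if it occurs at least `threshold` times in r AND appears in s.
    """
    freq = Counter(k for k, _ in r)
    s_keys = {k for k, _ in s}
    hot = {k for k, n in freq.items() if n >= threshold and k in s_keys}

    r_hot = [row for row in r if row[0] in hot]
    r_rest = [row for row in r if row[0] not in hot]
    s_hot = [row for row in s if row[0] in hot]
    s_rest = [row for row in s if row[0] not in hot]

    # Every r_hot row has a match in s_hot, so an inner join is sufficient.
    return left_outer_join(r_rest, s_rest) + inner_join(r_hot, s_hot)
```

Under these assumptions the two partial results are disjoint by key, so their concatenation contains the same rows as a direct left outer join of the full inputs, only possibly in a different order.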