Institute of Computing Technology, Chinese Academy IR
Automatic performance debugging of SPMD-style parallel programs | |
Liu, Xu1,4; Zhan, Jianfeng1; Zhan, Kunlin3; Shi, Weisong2; Yuan, Lin1,3,5; Meng, Dan1; Wang, Lei1 | |
2011-07-01 | |
Journal | JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING |
ISSN | 0743-7315 |
Volume | 71, Issue: 7, Pages: 925-937 |
Abstract | Automatic performance debugging of parallel applications includes two main steps: locating performance bottlenecks and uncovering their root causes for performance optimization. Previous work fails to resolve this challenging issue in two ways: first, several previous efforts automate locating bottlenecks, but present results in a confined way that only identifies performance problems with a priori knowledge; second, several tools apply exploratory or confirmatory data analysis to automatically discover relevant performance data relationships, but these efforts do not focus on locating performance bottlenecks or uncovering their root causes. The single program, multiple data (SPMD) programming model is widely used for both high performance computing and cloud computing. In this paper, we design and implement an innovative system, AutoAnalyzer, that automates the process of debugging performance problems of SPMD-style parallel programs, including data collection, performance behavior analysis, locating bottlenecks, and uncovering their root causes. AutoAnalyzer is unique in terms of two features: first, without any prior knowledge, it automatically locates bottlenecks and uncovers their root causes for performance optimization; second, it is lightweight in terms of the size of performance data to be collected and analyzed.
Our contributions are three-fold: first, we propose two effective clustering algorithms to investigate the existence of performance bottlenecks that cause process behavior dissimilarity or code region behavior disparity, respectively; meanwhile, we present two searching algorithms to locate bottlenecks; second, on the basis of rough set theory, we propose an innovative approach to automatically uncover root causes of bottlenecks; third, on cluster systems with two different configurations, we use two production applications, written in Fortran 77, and one open source code, MPIBZIP2 (http://compression.ca/mpibzip2/), written in C++, to verify the effectiveness and correctness of our methods. For the three applications, we also propose an experimental approach to investigating the effects of different metrics on locating bottlenecks. © 2011 Elsevier Inc. All rights reserved. |
Keywords | SPMD parallel programs; Automatic performance debugging; Performance bottleneck; Root cause analysis; Performance optimization |
DOI | 10.1016/j.jpdc.2011.03.006 |
Indexed By | SCI |
Language | English |
Funding | NSFC[60703020] ; NSFC[60933003] ; Chinese national 973 project[2011CB302500] ; Chinese national 863 project[2009AA01Z128] |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Theory & Methods |
WOS Accession Number | WOS:000291287900004 |
Publisher | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/12849 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Papers (English) |
Corresponding Author | Zhan, Jianfeng |
Author Affiliations | 1.China Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China 2.Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA 3.Chinese Acad Sci, Grad Univ, Beijing 100864, Peoples R China 4.Rice Univ, Dept Comp Sci, Houston, TX 77251 USA 5.Chinese Acad Sci, Inst Comp Technol, Beijing 100864, Peoples R China |
Recommended Citation GB/T 7714 | Liu, Xu,Zhan, Jianfeng,Zhan, Kunlin,et al. Automatic performance debugging of SPMD-style parallel programs[J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING,2011,71(7):925-937. |
APA | Liu, Xu.,Zhan, Jianfeng.,Zhan, Kunlin.,Shi, Weisong.,Yuan, Lin.,...&Wang, Lei.(2011).Automatic performance debugging of SPMD-style parallel programs.JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING,71(7),925-937. |
MLA | Liu, Xu,et al."Automatic performance debugging of SPMD-style parallel programs".JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 71.7(2011):925-937. |