HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems
Ren, Rui [1,2]; Cheng, Jiechao [3]; He, Xi-Wen [1]; Wang, Lei [1]; Zhan, Jian-Feng [1]; Gao, Wan-Ling [1]; Luo, Chun-Jie [1,2]
2019-11-01
Journal: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
ISSN: 1000-9000
Volume: 34; Issue: 6; Pages: 1167-1184
Abstract: With the tremendous growth of interest in Big Data, improving the performance of Big Data systems is becoming increasingly important. Among many steps, the first is to analyze and diagnose the performance bottlenecks of Big Data systems. Currently, there are two major solutions. One is the pure data-driven diagnosis approach, which may be very time-consuming; the other is the rule-based analysis method, which usually requires prior knowledge. For Big Data applications such as Spark workloads, we observe that the tasks in the same stage normally execute the same or similar code on each data partition. On the basis of this stage similarity and the distributed characteristics of Big Data systems, we analyze the behavior of Big Data applications in terms of both system and micro-architectural metrics for each stage. Furthermore, for different performance problems, we propose a hybrid approach that combines prior rules and machine learning algorithms to detect performance anomalies such as straggler tasks, task assignment imbalance, data skew, abnormal nodes, and outlier metrics. Following this methodology, we design and implement a lightweight, extensible tool named HybridTune, and measure its overhead and anomaly detection effectiveness using the BigDataBench benchmarks. Our experiments show that the overhead of HybridTune is only 5%, and that the accuracy of the outlier detection algorithm reaches up to 93%. Finally, we report several use cases diagnosing Spark and Hadoop workloads using BigDataBench, which demonstrate the potential use of HybridTune.
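Illustrative note: the stage-similarity observation in the abstract (tasks in one stage run the same or similar code, so their durations should be comparable) can be sketched as a simple rule-based straggler check. This is a minimal sketch under stated assumptions, not the method implemented in HybridTune: the function name `find_stragglers`, the input layout, and the 1.5x-median threshold are all hypothetical choices made for illustration.

```python
from statistics import median


def find_stragglers(durations_by_stage, factor=1.5):
    """Flag tasks whose duration exceeds factor * median of their own stage.

    durations_by_stage: {stage_id: {task_id: duration_seconds}} (assumed layout).
    factor: hypothetical threshold multiplier, not taken from the paper.
    Returns {stage_id: [task_id, ...]} containing only stages with stragglers.
    """
    stragglers = {}
    for stage_id, tasks in durations_by_stage.items():
        if not tasks:
            continue  # an empty stage has no median to compare against
        threshold = factor * median(tasks.values())
        slow = sorted(t for t, d in tasks.items() if d > threshold)
        if slow:
            stragglers[stage_id] = slow
    return stragglers


# Made-up durations: one task in stage-0 is clearly slower than its peers.
example = {
    "stage-0": {"t0": 10.0, "t1": 11.0, "t2": 10.5, "t3": 25.0},
    "stage-1": {"t0": 4.0, "t1": 4.2, "t2": 3.9},
}
result = find_stragglers(example)
```

Comparing each task only against peers in the same stage is what makes a fixed rule workable here: durations across different stages are not expected to be similar, so a global threshold would not be meaningful.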
Keywords: Big Data system; spatio-temporal correlation; rule-based diagnosis; machine learning
DOI: 10.1007/s11390-019-1968-y
Indexed by: SCI
Language: English
Funding: National Key Research and Development Program of China [2016YFB1000601]
WOS research area: Computer Science
WOS categories: Computer Science, Hardware & Architecture; Computer Science, Software Engineering
WOS accession number: WOS:000511331700001
Publisher: SCIENCE PRESS
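Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/14418
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Zhan, Jian-Feng
Affiliations: 1. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3. Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
Recommended citation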
GB/T 7714
Ren, Rui,Cheng, Jiechao,He, Xi-Wen,et al. HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,2019,34(6):1167-1184.
APA Ren, Rui.,Cheng, Jiechao.,He, Xi-Wen.,Wang, Lei.,Zhan, Jian-Feng.,...&Luo, Chun-Jie.(2019).HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,34(6),1167-1184.
MLA Ren, Rui,et al."HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems".JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 34.6(2019):1167-1184.