SAHA: A String Adaptive Hash Table for Analytical Databases
Zheng, Tianqi [1,2]; Zhang, Zhibin [1]; Cheng, Xueqi [1,2]
2020-03-01
Journal: APPLIED SCIENCES-BASEL
Volume: 10; Issue: 6; Pages: 18
Abstract: Hash tables are the fundamental data structure for analytical database workloads, such as aggregation, joining, set filtering and records deduplication. The performance aspects of hash tables differ drastically with respect to what kind of data are being processed or how many inserts, lookups and deletes are constructed. In this paper, we address some common use cases of hash tables: aggregating and joining over arbitrary string data. We designed a new hash table, SAHA, which is tightly integrated with modern analytical databases and optimized for string data with the following advantages: (1) it inlines short strings and saves hash values for long strings only; (2) it uses special memory loading techniques to do quick dispatching and hashing computations; and (3) it utilizes vectorized processing to batch hashing operations. Our evaluation results reveal that SAHA outperforms state-of-the-art hash tables by one to five times in analytical workloads, including Google's SwissTable and Facebook's F14Table. It has been merged into the ClickHouse database and shows promising results in production.
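The three ideas listed in the abstract can be illustrated with a small sketch. This is not the SAHA or ClickHouse implementation: the names `SHORT_MAX`, `LongStringTable`, and `StringCountTable` are hypothetical, the 8-byte inline threshold is an assumption, a Python `dict` stands in for the fixed-width short-key table, and Python's built-in `hash` stands in for the real hash function. Short keys are packed into an integer (inlined), long keys keep their saved hash next to the string, and a batch is hashed in one pass before any probing.

```python
from typing import Dict, List, Optional, Tuple

SHORT_MAX = 8  # assumed inline threshold for this sketch, in bytes


class LongStringTable:
    """Open-addressing table storing (saved_hash, key, count) per slot."""

    def __init__(self, capacity: int = 16) -> None:
        # Capacity stays a power of two so that `h & mask` selects a slot.
        self.slots: List[Optional[Tuple[int, bytes, int]]] = [None] * capacity
        self.size = 0

    def _find(self, key: bytes, h: int) -> int:
        mask = len(self.slots) - 1
        i = h & mask
        while True:
            slot = self.slots[i]
            # Compare the saved hash first; compare bytes only on a hash match.
            if slot is None or (slot[0] == h and slot[1] == key):
                return i
            i = (i + 1) & mask  # linear probing

    def add(self, key: bytes, h: int, delta: int) -> None:
        i = self._find(key, h)
        slot = self.slots[i]
        if slot is None:
            self.slots[i] = (h, key, delta)
            self.size += 1
            if self.size * 2 > len(self.slots):
                self._grow()
        else:
            self.slots[i] = (h, key, slot[2] + delta)

    def _grow(self) -> None:
        old = [s for s in self.slots if s is not None]
        self.slots = [None] * (len(self.slots) * 2)
        for h, key, count in old:
            # Resizing reuses the saved hash; the string is not hashed again.
            self.slots[self._find(key, h)] = (h, key, count)

    def get(self, key: bytes, h: int) -> int:
        slot = self.slots[self._find(key, h)]
        return 0 if slot is None else slot[2]


class StringCountTable:
    """Counts string keys, dispatching on key length."""

    def __init__(self) -> None:
        self.short: Dict[int, int] = {}  # packed integer key -> count
        self.long = LongStringTable()

    @staticmethod
    def _is_short(key: bytes) -> bool:
        # A key ending in a NUL byte would pack to the same integer as its
        # shorter prefix, so this sketch sends such keys (and the empty key)
        # to the long table instead.
        return 0 < len(key) <= SHORT_MAX and key[-1] != 0

    def add_batch(self, keys: List[bytes]) -> None:
        # Phase 1: dispatch and compute packed keys or hashes for the batch.
        prepared = [
            (True, int.from_bytes(k, "little"), k) if self._is_short(k)
            else (False, hash(k), k)
            for k in keys
        ]
        # Phase 2: probe and update using the precomputed values.
        for is_short, code, k in prepared:
            if is_short:
                self.short[code] = self.short.get(code, 0) + 1
            else:
                self.long.add(k, code, 1)

    def count(self, key: bytes) -> int:
        if self._is_short(key):
            return self.short.get(int.from_bytes(key, "little"), 0)
        return self.long.get(key, hash(key))
```

For example, after `t = StringCountTable()` and `t.add_batch([b"us", b"de", b"us"])`, calling `t.count(b"us")` returns 2. The short path never stores or compares the original bytes, which is the point of inlining; the long path pays for a stored hash so that most mismatches are rejected without a string comparison.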
Keywords: hash table; analytical database; string data
DOI: 10.3390/app10061915
Indexed by: SCI
Language: English
Funding: Strategic Priority Research Program of the Chinese Academy of Sciences [XDA19020400]
WOS research areas: Chemistry; Engineering; Materials Science; Physics
WOS categories: Chemistry, Multidisciplinary; Engineering, Multidisciplinary; Materials Science, Multidisciplinary; Physics, Applied
WOS accession number: WOS:000529252800016
Publisher: MDPI
Citation statistics
Times cited: 4 [WOS]
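Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/15030
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Zheng, Tianqi
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, CAS Key Lab Network Data Sci & Technol, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China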
Recommended citation
GB/T 7714: Zheng, Tianqi, Zhang, Zhibin, Cheng, Xueqi. SAHA: A String Adaptive Hash Table for Analytical Databases[J]. APPLIED SCIENCES-BASEL, 2020, 10(6): 18.
APA: Zheng, Tianqi, Zhang, Zhibin, & Cheng, Xueqi. (2020). SAHA: A String Adaptive Hash Table for Analytical Databases. APPLIED SCIENCES-BASEL, 10(6), 18.
MLA: Zheng, Tianqi, et al. "SAHA: A String Adaptive Hash Table for Analytical Databases". APPLIED SCIENCES-BASEL 10.6 (2020): 18.