Institute of Computing Technology, Chinese Academy IR
| Computational Burst Buffers: Accelerating HPC I/O via In-Storage Compression Offloading | |
| Chen, Xiang1; Lu, Bing2,3; Long, Haoquan3,4; Luo, Huizhang2; Ma, Yili3; Tan, Guangming3; Tao, Dingwen3; Wu, Fei1; Lu, Tao5 | |
| 2026-02-01 | |
| Journal | IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS |
| ISSN | 1045-9219 |
| Volume | 37; Issue: 2; Pages: 518-532 |
| Abstract | Burst buffers (BBs) act as an intermediate storage layer between compute nodes and parallel file systems (PFS), effectively alleviating the I/O performance gap in high-performance computing (HPC). As scientific simulations and AI workloads generate larger checkpoints and analysis outputs, BB capacity shortages and PFS bandwidth bottlenecks are emerging, and CPU-based compression is not an effective solution due to its high overhead. We introduce Computational Burst Buffers (CBBs), a storage paradigm that embeds hardware compression engines, such as application-specific integrated circuits (ASICs), inside computational storage drives (CSDs) at the BB tier. CBB transparently offloads both lossless and error-bounded lossy compression from CPUs to CSDs, thereby (i) expanding effective SSD-backed BB capacity, (ii) reducing BB-PFS traffic, and (iii) eliminating contention and energy overheads of CPU-based compression. Unlike prior CSD-based compression designs targeting databases or flash caching, CBB co-designs the burst-buffer layer and CSD hardware for HPC and quantitatively evaluates compression offload in BB-PFS hierarchies. We prototype CBB using a PCIe 5.0 CSD with an ASIC Zstd-like compressor and an FPGA prototype of an SZ entropy encoder, and evaluate CBB on a 16-node cluster. Experiments with four representative HPC applications and a large-scale workflow simulator show up to 61% lower application runtime, 8-12x higher cache hit ratios, and substantially reduced compute-node CPU utilization compared to software compression and conventional BBs. These results demonstrate that compression-aware BBs with CSDs provide a practical, scalable path to next-generation HPC storage. |
| Keywords | Hardware ; Computer architecture ; File systems ; Nonvolatile memory ; Bandwidth ; Engines ; Prototypes ; Data compression ; Software ; Flash memories ; high performance computing ; solid state drives |
| DOI | 10.1109/TPDS.2025.3643175 |
| Indexed By | SCI |
| Language | English |
| WOS Research Area | Computer Science ; Engineering |
| WOS Subject | Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
| WOS ID | WOS:001655675200001 |
| Publisher | IEEE COMPUTER SOC |
| Document Type | Journal article |
| Identifier | http://119.78.100.204/handle/2XEOYT63/42918 |
| Collection | Institute of Computing Technology, Chinese Academy of Sciences |
| Corresponding Author | Luo, Huizhang; Tao, Dingwen; Lu, Tao |
| Affiliation | 1. Huazhong Univ Sci & Technol, Wuhan 430074, Peoples R China; 2. Hunan Univ, Changsha 410008, Peoples R China; 3. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China; 4. Univ Sci & Technol China, Hefei 230026, Peoples R China; 5. DapuStor Corp, Shenzhen 518100, Peoples R China |
| Recommended Citation (GB/T 7714) | Chen, Xiang,Lu, Bing,Long, Haoquan,et al. Computational Burst Buffers: Accelerating HPC I/O via In-Storage Compression Offloading[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,2026,37(2):518-532. |
| APA | Chen, Xiang.,Lu, Bing.,Long, Haoquan.,Luo, Huizhang.,Ma, Yili.,...&Lu, Tao.(2026).Computational Burst Buffers: Accelerating HPC I/O via In-Storage Compression Offloading.IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,37(2),518-532. |
| MLA | Chen, Xiang,et al."Computational Burst Buffers: Accelerating HPC I/O via In-Storage Compression Offloading".IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 37.2(2026):518-532. |
| Files in This Item | There are no files associated with this item. |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.