CSpace

Browse/search results: 42 items in total, showing items 1-10

IrGEMM: An Input-Aware Tuning Framework for Irregular GEMM on ARM and X86 CPUs (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, Volume: 35, Issue: 9, Pages: 1672-1689
Authors:  Wei, Cunyang;  Jia, Haipeng;  Zhang, Yunquan;  Yao, Jianyu;  Li, Chendi;  Cao, Wenxuan
Views/Downloads: 1/0  |  Submitted: 2024/12/06
Kernel  Libraries  Computer architecture  Tuning  Layout  Optimization  Codes  Batch GEMM  code generation  compact GEMM  dynamic programming  TSMM  
Redesigning OpenKMC for Multi-Component Trillion-Atom Simulations on the New Sunway Supercomputer (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, Volume: 34, Issue: 7, Pages: 1997-2010
Authors:  Xu, Lei;  Shang, Honghui;  Chen, Xin;  Zhang, Yunquan;  Wang, Lifang;  Gao, Xingyu;  Song, Haifeng
Views/Downloads: 18/0  |  Submitted: 2023/12/04
Metals  Computational modeling  Monte Carlo methods  Kinetic theory  Aging  Steel  Silicon  Atomic kinetic Monte Carlo  many-core processor  scalability  
AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, Volume: 34, Issue: 3, Pages: 766-780
Authors:  Cao, Hang;  Yuan, Liang;  Zhang, He;  Zhang, Yunquan;  Wu, Baodong;  Li, Kun;  Li, Shigang;  Zhang, Minghua;  Lu, Pengqi;  Xiao, Junmin
Views/Downloads: 24/0  |  Submitted: 2023/07/12
Atmospheric general circulation model  3-D decomposition  leap-format finite-difference  heterogeneous acceleration  
An Accurate and Efficient Large-Scale Regression Method Through Best Friend Clustering (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, Volume: 33, Issue: 11, Pages: 3129-3140
Authors:  Li, Kun;  Yuan, Liang;  Zhang, Yunquan;  Chen, Gongwei
Views/Downloads: 40/0  |  Submitted: 2022/12/07
Clustering algorithms  Training  Mathematical models  Computational modeling  Libraries  Kernel  Support vector machines  Distributed machine learning  scalable algorithm  large-scale clustering  parallel regression  
Adaptive Federated Learning With Non-IID Data (Journal article)
COMPUTER JOURNAL, 2022, Pages: 15
Authors:  Zeng, Yan;  Mu, Yuankai;  Yuan, Junfeng;  Teng, Siyuan;  Zhang, Jilin;  Wan, Jian;  Ren, Yongjian;  Zhang, Yunquan
Views/Downloads: 20/0  |  Submitted: 2023/07/12
Federated Learning  Model Aggregation  Non-IID  
Scaling Poisson Solvers on Many Cores via MMEwald (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, Volume: 33, Issue: 8, Pages: 1888-1901
Authors:  Wu, Mingchuan;  Wu, Yangjun;  Shang, Honghui;  Liu, Ying;  Cui, Huimin;  Li, Fang;  Duan, Xiaohui;  Zhang, Yunquan;  Feng, Xiaobing
Views/Downloads: 44/0  |  Submitted: 2022/06/21
Optimization  Bandwidth  Supercomputers  Electric potential  Boundary conditions  Electrostatics  Silicon  Poisson solver  architecture-specific optimizations  many-core processor  
Many-core acceleration of the first-principles all-electron quantum perturbation calculations (Journal article)
COMPUTER PHYSICS COMMUNICATIONS, 2021, Volume: 267, Pages: 8
Authors:  Shang, Honghui;  Duan, Xiaohui;  Li, Fang;  Zhang, Libo;  Xu, Zhiqian;  Liu, Kan;  Luo, Haiwen;  Ji, Yingrui;  Zhao, Wenxuan;  Xue, Wei;  Chen, Li;  Zhang, Yunquan
Views/Downloads: 44/0  |  Submitted: 2021/12/01
Density-functional perturbation theory  Many-core architecture  Linear scaling  MPI  Numeric atomic orbitals  
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, Volume: 32, Issue: 7, Pages: 1702-1712
Authors:  Cheng, Daning;  Li, Shigang;  Zhang, Hanping;  Xia, Fen;  Zhang, Yunquan
Views/Downloads: 42/0  |  Submitted: 2021/12/01
Training  Scalability  Machine learning  Machine learning algorithms  Stochastic processes  Task analysis  Upper bound  Parallel training algorithms  training dataset  scalability  stochastic optimization methods  
Efficient parallel linear scaling method to get the response density matrix in all-electron real-space density-functional perturbation theory (Journal article)
COMPUTER PHYSICS COMMUNICATIONS, 2021, Volume: 258, Pages: 11
Authors:  Shang, Honghui;  Liang, WanZhen;  Zhang, Yunquan;  Yang, Jinlong
Views/Downloads: 42/0  |  Submitted: 2021/12/01
Density-functional perturbation theory  Linear scaling  MPI  Numeric atomic orbitals  Density-functional theory
WP-SGD: Weighted parallel SGD for distributed unbalanced-workload training system (Journal article)
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2020, Volume: 145, Pages: 202-216
Authors:  Cheng, Daning;  Li, Shigang;  Zhang, Yunquan
Views/Downloads: 57/0  |  Submitted: 2020/12/10
SGD  Unbalanced workload  SimuParallel SGD  Distributed system