CSpace

Browse/search results: 75 items in total, showing items 1-10

Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2024, Volume: 21, Issue: 1, Pages: 26
Authors: Wang, Xueying; Li, Guangli; Jia, Zhen; Feng, Xiaobing; Wang, Yida
Views/Downloads: 12/0 | Submitted: 2024/05/20
Deep learning  winograd convolution  low-precision computation  
OpenCL-accelerated first-principles calculations of all-electron quantum perturbations on HPC resources (vol 11, 1156891, 2023) (Journal article)
FRONTIERS IN CHEMISTRY, 2023, Volume: 11, Pages: 1
Authors: Wu, Zhikun; Shang, Honghui; Wu, Yangjun; Zhang, Zhongcheng; Liu, Ying; Zhang, Yuyang; Ouyang, Yucheng; Cui, Huimin; Feng, Xiaobing
Views/Downloads: 15/0 | Submitted: 2023/12/04
OpenCL  DFPT  GPU  optimization  heterogeneous  
OpenCL-accelerated first-principles calculations of all-electron quantum perturbations on HPC resources (Journal article)
FRONTIERS IN CHEMISTRY, 2023, Volume: 11, Pages: 15
Authors: Wu, Zhikun; Shang, Honghui; Wu, Yangjun; Zhang, Zhongcheng; Liu, Ying; Zhang, Yuyang; Ouyang, Yucheng; Cui, Huimin; Feng, Xiaobing
Views/Downloads: 15/0 | Submitted: 2023/12/04
OpenCL  DFPT  GPU  optimization  heterogeneous  
An Application-oblivious Memory Scheduling System for DNN Accelerators (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2022, Volume: 19, Issue: 4, Pages: 26
Authors: Li, Jiansong; Wang, Xueying; Chen, Xiaobing; Li, Guangli; Dong, Xiao; Zhao, Peng; Yu, Xianzhi; Yang, Yongxin; Cao, Wei; Liu, Lei; Feng, Xiaobing
Views/Downloads: 22/0 | Submitted: 2023/07/12
Deep learning  memory scheduling  runtime system  DNN accelerators  
Scaling Poisson Solvers on Many Cores via MMEwald (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, Volume: 33, Issue: 8, Pages: 1888-1901
Authors: Wu, Mingchuan; Wu, Yangjun; Shang, Honghui; Liu, Ying; Cui, Huimin; Li, Fang; Duan, Xiaohui; Zhang, Yunquan; Feng, Xiaobing
Views/Downloads: 44/0 | Submitted: 2022/06/21
Optimization  Bandwidth  Supercomputers  Electric potential  Boundary conditions  Electrostatics  Silicon  Poisson solver  architecture-specific optimizations  many-core processor  
Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning (Journal article)
JOURNAL OF SYSTEMS ARCHITECTURE, 2022, Volume: 124, Pages: 11
Authors: Li, Guangli; Ma, Xiu; Wang, Xueying; Yue, Hengshan; Li, Jiansong; Liu, Lei; Feng, Xiaobing; Xue, Jingling
Views/Downloads: 34/0 | Submitted: 2022/12/07
Edge intelligence  Deep learning  Neural network compression  
Compiler-assisted Operator Template Library for DNN Accelerators (Journal article)
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2021, Pages: 18
Authors: Li, Jiansong; Cao, Wei; Dong, Xiao; Li, Guangli; Wang, Xueying; Zhao, Peng; Liu, Lei; Feng, Xiaobing
Views/Downloads: 43/0 | Submitted: 2021/12/01
DNN Accelerators  Template Library  Address Space Management  
Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices (Journal article)
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, Volume: 39, Issue: 11, Pages: 3614-3626
Authors: Li, Guangli; Ma, Xiu; Wang, Xueying; Liu, Lei; Xue, Jingling; Feng, Xiaobing
Views/Downloads: 56/0 | Submitted: 2021/12/01
Deep learning system  edge intelligence  model compression and acceleration  neural networks  
ParaML: A Polyvalent Multicore Accelerator for Machine Learning (Journal article)
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, Volume: 39, Issue: 9, Pages: 1764-1777
Authors: Zhou, Shengyuan; Guo, Qi; Du, Zidong; Liu, Daofu; Chen, Tianshi; Li, Ling; Liu, Shaoli; Zhou, Jinhong; Temam, Olivier; Feng, Xiaobing; Zhou, Xuehai; Chen, Yunji
Views/Downloads: 60/0 | Submitted: 2020/12/10
Neural networks  Machine learning  Testing  Support vector machines  Linear regression  Computers  Computer architecture  Accelerator  machine learning (ML) techniques  multicore accelerator  
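GPU Performance Optimization Method for Sparse Convolutional Neural Networks (面向稀疏卷积神经网络的GPU性能优化方法) (Journal article)
软件学报 (Journal of Software), 2020, Volume: 31, Issue: 9, Pages: 2944
Authors: Dong, Xiao (董晓); Liu, Lei (刘雷); Li, Jing (李晶); Feng, Xiaobing (冯晓兵)
Views/Downloads: 14/0 | Submitted: 2023/12/04
neural networks  sparse  GPU  performance optimization  convolution  code generation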