CSpace

Browse/Search results: 116 items in total, showing items 1-10

Mortar-FP8: Morphing the Existing FP32 Infrastructure for High-Performance Deep Learning Acceleration (Journal article)
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, Volume: 43, Issue: 3, Pages: 878-891
Authors: Li, Hongyan; Lu, Hang; Li, Xiaowei
Views/Downloads: 2/0 | Submitted: 2024/05/20
Deep learning accelerator  deep neural network (DNN)  fp8 format  
Double Precision is not Necessary for LSQR for Solving Discrete Linear Ill-Posed Problems (Journal article)
JOURNAL OF SCIENTIFIC COMPUTING, 2024, Volume: 98, Issue: 3, Pages: 30
Authors: Li, Haibo
Views/Downloads: 1/0 | Submitted: 2024/05/20
Mixed precision  Linear ill-posed problem  Regularization  LSQR  Roundoff unit  Semi-convergence  
Fpar: filter pruning via attention and rank enhancement for deep convolutional neural networks acceleration (Journal article)
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, Pages: 13
Authors: Chen, Yanming; Wu, Gang; Shuai, Mingrui; Lou, Shubin; Zhang, Yiwen; An, Zhulin
Views/Downloads: 2/0 | Submitted: 2024/05/20
Neural network  Model compression  Filter pruning  Attention  Rank enhancement  CNNs  
Learned Image Compression Using Cross-Component Attention Mechanism (Journal article)
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, Volume: 32, Pages: 5478-5493
Authors: Duan, Wenhong; Chang, Zheng; Jia, Chuanmin; Wang, Shanshe; Ma, Siwei; Song, Li; Gao, Wen
Views/Downloads: 7/0 | Submitted: 2023/12/04
Image coding  Context modeling  Transforms  Decoding  Standards  Image reconstruction  Transform coding  Image compression  cross-component  information-guided unit  attention mechanism  information-preserving  
BitXpro: Regularity-Aware Hardware Runtime Pruning for Deep Neural Networks (Journal article)
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, Volume: 31, Issue: 1, Pages: 90-103
Authors: Li, Hongyan; Lu, Hang; Wang, Haoxuan; Deng, Shengji; Li, Xiaowei
Views/Downloads: 13/0 | Submitted: 2023/07/12
Deep learning accelerator  deep neural network (DNN)  hardware runtime pruning  
Enabling In-Network Floating-Point Arithmetic for Efficient Computation Offloading (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, Volume: 33, Issue: 12, Pages: 4918-4934
Authors: Cui, Penglai; Pan, Heng; Li, Zhenyu; Zhang, Penghao; Miao, Tianhao; Zhou, Jianer; Guan, Hongtao; Xie, Gaogang
Views/Downloads: 14/0 | Submitted: 2023/07/12
Open area test sites  Arithmetic  Memory management  Task analysis  Training  Standards  Servers  In-network computation  computation offloading  floating-point operation  
FTT-NAS: Discovering Fault-tolerant Convolutional Neural Architecture (Journal article)
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2021, Volume: 26, Issue: 6, Pages: 24
Authors: Ning, Xuefei; Ge, Guangjun; Li, Wenshuo; Zhu, Zhenhua; Zheng, Yin; Chen, Xiaoming; Gao, Zhen; Wang, Yu; Yang, Huazhong
Views/Downloads: 18/0 | Submitted: 2022/12/07
Neural architecture search  fault tolerance  neural networks  
Optimizing the LINPACK Algorithm for Large-Scale PCIe-Based CPU-GPU Heterogeneous Systems (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, Volume: 32, Issue: 9, Pages: 2367-2380
Authors: Tan, Guangming; Shui, Chaoyang; Wang, Yinshan; Yu, Xianzhi; Yan, Yujin
Views/Downloads: 38/0 | Submitted: 2021/12/01
Pipeline processing  Graphics processing units  Computer architecture  Supercomputers  Clustering algorithms  Programming  Optimization  LINPACK algorithm  software pipeline  performance model  heterogeneous computing  cluster  
An efficient dataflow accelerator for scientific applications (Journal article)
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, Volume: 112, Pages: 580-588
Authors: Ye, Xiaochun; Tan, Xu; Wu, Meng; Feng, Yujing; Wang, Da; Zhang, Hao; Pei, Songwen; Fan, Dongrui
Views/Downloads: 219/0 | Submitted: 2020/12/10
Dataflow architecture  Scientific computing  Instruction level parallelism  
Automatic Generation of High-Performance FFT Kernels on Arm and X86 CPUs (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, Volume: 31, Issue: 8, Pages: 1925-1941
Authors: Li, Zhihao; Jia, Haipeng; Zhang, Yunquan; Chen, Tun; Yuan, Liang; Vuduc, Richard
Views/Downloads: 56/0 | Submitted: 2020/12/10
AutoFFT  FFT  code generation  template  DFT