CSpace

Browse/Search Results: 23 items in total, showing items 1-10

Compressing and Accelerating Sparse CNNs Using Sign-Reserved Toeplitz Filters and Input Activation Density-aware Dataflow (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, Volume: 22, Issue: 4, Pages: 23
Authors: Wang, Zhen; Liu, Tianyu; Fan, Zhihua; Li, Wenming; Qiu, Yuhang; Zhang, Zhiyuan; An, Xuejun; Fan, Dongrui; Ye, Xiaochun
Views/Downloads: 1/0 | Submitted: 2026/05/25
Convolutional neural networks  accelerators  sparsity  algorithm-hardware co-design  
A RISC-V Extended Infrastructure for CNNs Through Pipelined Computing and Data Dependence Optimization (Journal article)
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2025, Volume: 44, Issue: 11, Pages: 4141-4154
Authors: Luo, Teng; Xia, Tengfei; Chen, Jiayuan; Fan, Zhihua; Li, Wenming; Mu, Yudong; An, Xuejun; Ye, Xiaochun; Fan, Dongrui
Views/Downloads: 25/0 | Submitted: 2025/12/03
Artificial intelligence  Convolution  Convolutional neural networks  Computer architecture  Computational efficiency  Pipelines  Logic  Filters  Fans  Biological system modeling  Convolutional neural networks (CNNs) acceleration  dataflow optimization  pipelined computing  RISC-V extended instructions  
GenCNN: A Partition-Aware Multi-Objective Mapping Framework for CNN Accelerators Based on Genetic Algorithm (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, Volume: 22, Issue: 3, Pages: 26
Authors: Mu, Yudong; Fan, Zhihua; Li, Wenming; Zhang, Zhiyuan; An, Xuejun; Fan, Dongrui; Ye, Xiaochun
Views/Downloads: 21/0 | Submitted: 2025/12/03
CNN Accelerator  Dataflow Graph Mapping  Genetic Algorithm  Multi-objective Optimization  
CGCGraph: Efficient CPU-GPU Co-execution for Concurrent Dynamic Graph Processing (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, Volume: 22, Issue: 3, Pages: 26
Authors: Sun, Yiming; Zhang, Jie; Cao, Huawei; Zhang, Yuan; An, Xuejun; Huang, Junying; Ye, Xiaochun
Views/Downloads: 19/0 | Submitted: 2025/12/03
CPU-GPU co-execution  concurrent graph processing  dynamic graph snapshot processing  high throughput  
CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration (Journal article)
IEEE COMPUTER ARCHITECTURE LETTERS, 2025, Volume: 24, Issue: 2, Pages: 381-384
Authors: Liu, Yanhuan; Li, Wenming; Zhang, Kunming; Liu, Tianyu; Ye, Xiaochun; An, Xuejun
Views/Downloads: 1/0 | Submitted: 2026/05/25
Software  Hardware  Computational modeling  Sparse matrices  Pipelines  Indexes  Data models  Spatial databases  Computational efficiency  Vectors  Computation-driven architecture  sparse DNN acceleration  dataflow paradigm  unstructured sparsity  work tokenizer  dynamic execution core  asynchronous execution  
PANDA: Adaptive Prefetching and Decentralized Scheduling for Dataflow Architectures (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, Volume: 22, Issue: 2, Pages: 27
Authors: Qin, Shantian; Fan, Zhihua; Li, Wenming; Wang, Zhen; An, Xuejun; Ye, Xiaochun; Fan, Dongrui
Views/Downloads: 24/0 | Submitted: 2025/12/03
Prefetching  decentralized dynamic scheduling  reconfigurable on-chip memory architecture  
Accelerating tensor multiplication by exploring hybrid product with hardware and software co-design (Journal article)
JOURNAL OF SYSTEMS ARCHITECTURE, 2025, Volume: 159, Pages: 16
Authors: Zhang, Zhiyuan; Fan, Zhihua; Li, Wenming; Qiu, Yuhang; Wang, Zhen; Ye, Xiaochun; Fan, Dongrui; An, Xuejun
Views/Downloads: 27/0 | Submitted: 2025/06/25
Tensor multiplication  Hybrid product  Dataflow  Accelerator  
Improving Utilization of Dataflow Unit for Multi-Batch Processing (Journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2024, Volume: 21, Issue: 1, Pages: 26
Authors: Fan, Zhihua; Li, Wenming; Wang, Zhen; Yang, Yu; Ye, Xiaochun; Fan, Dongrui; Sun, Ninghui; An, Xuejun
Views/Downloads: 65/0 | Submitted: 2024/05/20
Utilization  network-on-chip  decoupled architecture  batch processing  
Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation (Journal article)
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, Volume: 34, Issue: 12, Pages: 3253-3265
Authors: Fan, Zhihua; Li, Wenming; Wang, Zhen; Liu, Tianyu; Wu, Haibin; Liu, Yanhuan; Wu, Meng; Wu, Xinxin; Ye, Xiaochun; Fan, Dongrui; Sun, Ninghui; An, Xuejun
Views/Downloads: 68/0 | Submitted: 2024/05/20
Accelerator  output activation  prediction  sparse convolutional neural network  
HyperFatTree: A Large-Scale Tree-Based Network with Low-Radix Switches (Journal article)
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2017, Volume: 45, Issue: 1, Pages: 172-184
Authors: Su, Yong; Wang, Zhan; Fan, Zhiguo; Cao, Zheng; Liu, Xiaoli; Shao, En; An, Xuejun; Sun, Ninghui
Views/Downloads: 159/0 | Submitted: 2019/12/12
High energy efficiency  Hierarchical topology  Low-radix switch  Large scale interconnecting network