CSpace

ShuffleInfer: Disaggregate LLM Inference for Mixed Downstream Workloads (journal article)
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2025, Volume: 22, Issue: 2, Pages: 24
Authors:  Hu, Cunchen;  Huang, Heyang;  Xu, Liangliang;  Chen, Xusheng;  Wang, Chenxi;  Xu, Jiang;  Chen, Shuang;  Feng, Hao;  Wang, Sa;  Bao, Yungang;  Sun, Ninghui;  Shan, Yizhou
Submitted: 2025/12/03
Keywords: LLM serving; disaggregated; interference; schedule