CSpace

Browse/Search Results: 8 records in total, showing 1-8

Cross Modal Compression With Variable Rate Prompt (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 3444-3456
Authors: Gao, Junlong; Li, Jiguo; Jia, Chuanmin; Wang, Shanshe; Ma, Siwei; Gao, Wen
Views/Downloads: 2/0 | Submitted: 2024/05/20
Cross modal compression  semantic fidelity  variable rate prompt  
STAM: A SpatioTemporal Attention Based Memory for Video Prediction (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 2354-2367
Authors: Chang, Zheng; Zhang, Xinfeng; Wang, Shanshe; Ma, Siwei; Gao, Wen
Views/Downloads: 7/0 | Submitted: 2023/12/04
Global spatiotemporal information  spatio temporal receptive field  3D convolutional neural network  spatiotemporal attention  sequence learning  video prediction  
Neighborhood Contrastive Transformer for Change Captioning (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 9518-9529
Authors: Tu, Yunbin; Li, Liang; Su, Li; Lu, Ke; Huang, Qingming
Views/Downloads: 2/0 | Submitted: 2024/05/20
Change captioning  neighborhood contrastive transformer  syntax dependencies  
Refined Knowledge Transfer for Language-Based Person Search (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 9315-9329
Authors: Wu, Ziqiang; Ma, Bingpeng; Chang, Hong; Shan, Shiguang
Views/Downloads: 1/0 | Submitted: 2024/05/20
Language-based person search  knowledge enhancement  cross-modal knowledge transfer  intra-modal knowledge transfer  knowledge refiner  
Focus and Align: Learning Tube Tokens for Video-Language Pre-Training (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 8036-8050
Authors: Zhu, Yongqing; Li, Xiangyang; Zheng, Mao; Yang, Jiahao; Wang, Zihan; Guo, Xiaoqian; Chai, Zifeng; Yuan, Yuchen; Jiang, Shuqiang
Views/Downloads: 2/0 | Submitted: 2024/05/20
Electron tubes  Semantics  Visualization  Feature extraction  Task analysis  Transformers  Detectors  Local alignment mechanism  semantic centers  tube tokens  video-language pre-training  
Know More Say Less: Image Captioning Based on Scene Graphs (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2019, Volume: 21, Issue: 8, Pages: 2117-2130
Authors: Li, Xiangyang; Jiang, Shuqiang
Views/Downloads: 76/0 | Submitted: 2019/12/10
Image captioning  scene graph  relationship  long short-term network  attention mechanism  vision-language  
Bundled Object Context for Referring Expressions (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2018, Volume: 20, Issue: 10, Pages: 2749-2760
Authors: Li, Xiangyang; Jiang, Shuqiang
Views/Downloads: 53/0 | Submitted: 2019/12/10
Bundled object context  referring expression  LSTM  vision-language  
GLA: Global-Local Attention for Image Description (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2018, Volume: 20, Issue: 3, Pages: 726-737
Authors: Li, Linghui; Tang, Sheng; Zhang, Yongdong; Deng, Lixi; Tian, Qi
Views/Downloads: 59/0 | Submitted: 2019/12/10
Convolutional neural network  recurrent neural network  image description  natural language processing