Bcgn: BLIP-based cross-modal grasping network for language-conditioned robotic grasping
Xu, Kai1,2; Wang, Lichun1; Li, Shuang3; Xin, Jianjia4; Yin, Baocai1
2025-10-13
Journal: MULTIMEDIA SYSTEMS
ISSN: 0942-4962
Volume: 31; Issue: 6; Pages: 12
Abstract: The performance of robots on the language-conditioned robotic grasping task reflects the intelligence level of robots. However, existing approaches lack the ability to handle implicit instructions and identify infeasible ones, which undermines the intelligence and operational safety of the robot. To overcome these limitations, this paper introduces a novel Language-conditioned Robotic Grasping Dataset (LRGD), which covers a variety of instruction types. Correspondingly, an end-to-end BLIP-based Cross-modal Grasping Network (BCGN) for language-conditioned grasping is proposed. Specifically, BCGN integrates BLIP to jointly model cross-modal information, and introduces a learnable circuit breaker that enables the model to actively reject infeasible requests. Furthermore, through collaboration with Large Vision-Language Models (LVLMs), BCGN can easily achieve zero-shot recognition of implicit instructions. Experimental results on the LRGD and in real-world scenarios demonstrate the effectiveness of BCGN in dealing with instructions of different complexity levels.
Keywords: Robotic grasp; Language-conditioned grasping; Grasping dataset; Cross-modal fusion
DOI: 10.1007/s00530-025-02005-y
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China[2021ZD0111902]; National Key R&D Program of China[62376014]; National Key R&D Program of China[62172022]; National Key R&D Program of China[U21B2038]; National Natural Science Foundation of China[2021JQR023]; Foundation for China University Industry-University-Research Innovation[KM202411232017]; R&D Program of Beijing Municipal Education Commission
WOS research area: Computer Science
WOS categories: Computer Science, Information Systems; Computer Science, Theory & Methods
WOS accession number: WOS:001592913400013
Publisher: SPRINGER
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/41657
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Papers (English)
Corresponding authors: Xu, Kai; Wang, Lichun
Author affiliations:
1. Beijing Univ Technol, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
3. Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China
4. INSPUR Grp CO LTD, Jinan 250101, Peoples R China
Recommended citation
GB/T 7714: Xu, Kai, Wang, Lichun, Li, Shuang, et al. Bcgn: BLIP-based cross-modal grasping network for language-conditioned robotic grasping[J]. MULTIMEDIA SYSTEMS, 2025, 31(6): 12.
APA: Xu, Kai, Wang, Lichun, Li, Shuang, Xin, Jianjia, & Yin, Baocai. (2025). Bcgn: BLIP-based cross-modal grasping network for language-conditioned robotic grasping. MULTIMEDIA SYSTEMS, 31(6), 12.
MLA: Xu, Kai, et al. "Bcgn: BLIP-based cross-modal grasping network for language-conditioned robotic grasping". MULTIMEDIA SYSTEMS 31.6 (2025): 12.