Institute of Computing Technology, Chinese Academy IR
FZeroTC: fully zero-shot text classification for simultaneously discovering and labeling unseen classes | |
Duan, Dongsheng1; Lv, Cunchi2,3; Zhang, Cheng2; Hou, Wei1; Shi, Lei1; Li, Yangxi1; Zhao, Xiaofang2,3 | |
2025-04-21 | |
Journal | KNOWLEDGE AND INFORMATION SYSTEMS
ISSN | 0219-1377 |
Pages | 25 |
Abstract | With the explosive data growth on the web, there are massive textual data without class labels such that zero-shot text classification has attracted much research attention. However, existing zero-shot text classification models still take the class labels as the weakly supervised signal, which are usually unavailable in the open domain. In this paper, we study the text classification problem in a fully zero-shot setting, in which not only are we not given any training samples for unseen classes, but also the label names and the total number of unseen classes are unknown. We propose a fully zero-shot text classification model (FZeroTC) in a semi-supervised learning framework to simultaneously discover and label unseen classes. In the FZeroTC model, a pairwise loss and a Kullback-Leibler divergence-based regularization term are specially designed for unseen class discovery, and a faraway loss is specially designed for class labeling. We propose three different kinds of learning strategies based on the pretrained language model and prompt learning to train FZeroTC. From extensive experiments on four public text classification datasets, FZeroTC outperforms the state-of-the-art zero-shot text classification models in terms of unseen class discovery performance and can provide high-quality labels for unseen classes. |
Keywords | Zero-shot text classification ; Semi-supervised learning ; Pretrained language model ; Prompt learning ; Class discovery ; Class labeling |
DOI | 10.1007/s10115-025-02379-5 |
Indexed by | SCI |
Language | English |
Funding | National Natural Science Foundation of China[62272125] ; National Natural Science Foundation of China[62192785] |
WOS research area | Computer Science |
WOS category | Computer Science, Artificial Intelligence ; Computer Science, Information Systems |
WOS accession number | WOS:001471214800001 |
Publisher | SPRINGER LONDON LTD |
Document type | Journal article |
Item identifier | http://119.78.100.204/handle/2XEOYT63/40603 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding author | Duan, Dongsheng; Zhang, Cheng |
Affiliation | 1. Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing 100029, Peoples R China 2. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China 3. Univ Chinese Acad Sci, Beijing 100049, Peoples R China |
Recommended citation GB/T 7714 | Duan, Dongsheng,Lv, Cunchi,Zhang, Cheng,et al. FZeroTC: fully zero-shot text classification for simultaneously discovering and labeling unseen classes[J]. KNOWLEDGE AND INFORMATION SYSTEMS,2025:25. |
APA | Duan, Dongsheng.,Lv, Cunchi.,Zhang, Cheng.,Hou, Wei.,Shi, Lei.,...&Zhao, Xiaofang.(2025).FZeroTC: fully zero-shot text classification for simultaneously discovering and labeling unseen classes.KNOWLEDGE AND INFORMATION SYSTEMS,25. |
MLA | Duan, Dongsheng,et al."FZeroTC: fully zero-shot text classification for simultaneously discovering and labeling unseen classes".KNOWLEDGE AND INFORMATION SYSTEMS (2025):25. |
Files in this item | There are no files associated with this item. |