A neural topic model with word vectors and entity vectors for short texts
Zhao, Xiaowei (1); Wang, Deqing (1); Zhao, Zhengyang (1); Liu, Wei (2); Lu, Chenwei (1); Zhuang, Fuzhen (3, 4)
2021-03-01
Journal: INFORMATION PROCESSING & MANAGEMENT
ISSN: 0306-4573
Volume: 58; Issue: 2; Pages: 11
Abstract: Traditional topic models are widely used for semantic discovery from long texts. However, they usually fail to mine high-quality topics from short texts (e.g., tweets) because features are sparse and word co-occurrence patterns are lacking. In this paper, we propose a Variational Auto-Encoder Topic Model (VAETM for short) that combines word vector representations and entity vector representations to address these limitations. Specifically, we first learn an embedding representation of each word from a large-scale external corpus and of each entity from a large, manually edited knowledge graph. We then integrate the embedding representations into the variational auto-encoder framework and propose an unsupervised model, VAETM, to infer the latent representation of topic distributions. To further boost VAETM, we propose an improved supervised VAETM (SVAETM for short) that uses label information in the training set to supervise both the inference of the latent representation of topic distributions and the generation of topics. Finally, we propose KL-divergence-based inference algorithms to infer the approximate posterior distributions for our two models. Extensive experiments on three common short-text datasets demonstrate that the proposed VAETM and SVAETM outperform various state-of-the-art models in terms of perplexity, NPMI, and accuracy.
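The abstract names NPMI as one of the evaluation metrics but does not spell out how it is computed. As a rough illustration only, the sketch below uses a common definition, NPMI(w1, w2) = log(p(w1, w2) / (p(w1) p(w2))) / (-log p(w1, w2)), averaged over all pairs of a topic's top words. The function name `npmi_coherence`, the use of document-level co-occurrence, and the handling of boundary cases are assumptions made for this example; this is not the authors' evaluation code.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words (hypothetical helper).

    `documents` is a list of token lists; `top_words` needs at least two words.
    Probabilities are estimated from document-level co-occurrence (an assumption).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # Assumed convention: words that never co-occur score -1.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Assumed convention: avoids dividing by -log(1) = 0.
            scores.append(1.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Toy corpus of four short "documents" (made-up data for illustration).
docs = [
    ["apple", "banana"],
    ["apple", "banana"],
    ["cherry"],
    ["cherry", "apple"],
]
print(npmi_coherence(["apple", "banana"], docs))  # approximately 0.415
```

Scores range from -1 (the words never co-occur) to 1 (the words always co-occur), so higher averages suggest more coherent topics. Published evaluations often estimate the probabilities from a sliding window over a reference corpus rather than from whole documents.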
Keywords: Topic model; Short text; Variational auto-encoder; Word embedding; Entity embedding
DOI: 10.1016/j.ipm.2020.102455
Indexed by: SCI
Language: English
Funding: National Key R&D Program of China [2019YFA0707204]; National Natural Science Foundation of China [U1836206]
WOS research area: Computer Science; Information Science & Library Science
WOS category: Computer Science, Information Systems; Information Science & Library Science
WOS accession number: WOS:000612229800005
Publisher: ELSEVIER SCI LTD
Citation statistics
Times cited: 28 [WOS]
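Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/16198
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Papers (English)
Corresponding author: Zhuang, Fuzhen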
Author affiliations:
1. Beihang Univ, Sch Comp Sci, Beijing 100191, Peoples R China
2. Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing 100029, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, CAS, Beijing 100190, Peoples R China
4. Chinese Acad Sci, Xiamen Data Intelligence Acad ICT, Beijing, Peoples R China
Recommended citation
GB/T 7714: Zhao, Xiaowei, Wang, Deqing, Zhao, Zhengyang, et al. A neural topic model with word vectors and entity vectors for short texts[J]. INFORMATION PROCESSING & MANAGEMENT, 2021, 58(2): 11.
APA: Zhao, Xiaowei, Wang, Deqing, Zhao, Zhengyang, Liu, Wei, Lu, Chenwei, & Zhuang, Fuzhen. (2021). A neural topic model with word vectors and entity vectors for short texts. INFORMATION PROCESSING & MANAGEMENT, 58(2), 11.
MLA: Zhao, Xiaowei, et al. "A neural topic model with word vectors and entity vectors for short texts." INFORMATION PROCESSING & MANAGEMENT 58.2 (2021): 11.