Institute of Computing Technology, Chinese Academy IR
A WORD POSITION-RELATED LDA MODEL | |
Zhai, Lidong1; Ding, Zhaoyun2; Jia, Yan2; Zhou, Bin2 | |
2011-09-01 | |
Journal | INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE |
ISSN | 0218-0014 |
Volume | 25, Issue: 6, Pages: 909-925 |
Abstract | Latent Dirichlet Allocation (LDA), proposed by Blei, is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics and each topic is characterized by a distribution over words; it does not model the positions of words within each document. This paper proposes a Word Position-Related LDA Model that takes the word positions in every document of the corpus into account, with each word characterized by a distribution over word positions. The precision of topic-word interpretability is improved by integrating the word-position distribution with the appropriate word degree, since the word degree differs across word positions. Finally, a new method, the size-aware word intrusion method, is proposed to improve the interpretability of topic words. Experimental results on the NIPS corpus show that the Word Position-Related LDA Model improves the precision of topic-word interpretability, with an average improvement of about 9.67%. A comparison of the experimental data also shows that the size-aware word intrusion method interprets the semantic information of topic words more comprehensively and more effectively. |
Keywords | LDA; probabilistic topic models; word position; word degree; word intrusion |
DOI | 10.1142/S0218001411008890 |
Indexed By | SCI |
Language | English |
Funding Project | Basic Research Program of China (973 Program)[2007CB311100] ; National Natural Science Foundation of China[61003261] ; National Natural Science Foundation of China[60933005] ; National Natural Science Foundation of China[60873204] ; National Natural Science Foundation of China[12505] ; National Natural Science Foundation of China[2011AA010702] |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence |
WOS ID | WOS:000295128400006 |
Publisher | WORLD SCIENTIFIC PUBL CO PTE LTD |
Document Type | Journal article |
Identifier | http://119.78.100.204/handle/2XEOYT63/13164 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Zhai, Lidong |
Affiliation | 1. Chinese Acad Sci, Inst Comp Technol, Res Ctr Informat Secur, Beijing, Peoples R China; 2. Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China |
Recommended Citation (GB/T 7714) | Zhai, Lidong, Ding, Zhaoyun, Jia, Yan, et al. A WORD POSITION-RELATED LDA MODEL[J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2011, 25(6): 909-925. |
APA | Zhai, Lidong, Ding, Zhaoyun, Jia, Yan, & Zhou, Bin. (2011). A WORD POSITION-RELATED LDA MODEL. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 25(6), 909-925. |
MLA | Zhai, Lidong, et al. "A WORD POSITION-RELATED LDA MODEL". INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE 25.6 (2011): 909-925. |