Institute of Computing Technology, Chinese Academy IR
Innovating web page classification through reducing noise | |
Li, XL; Shi, ZZ | |
2002 | |
Journal | JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY |
ISSN | 1000-9000 |
Volume | 17, Issue: 1, Pages: 9-17 |
Abstract | This paper presents a new method that eliminates noise in Web page classification. It first describes the presentation of a Web page based on HTML tags. Then through a novel distance formula, it eliminates the noise in similarity measure. After carefully analyzing Web pages, we design an algorithm that can distinguish related hyperlinks from noisy ones. We can utilize non-noisy hyperlinks to improve the performance of Web page classification (the CAWN algorithm). For any page, we can classify it through the text and category of neighbor pages related to the page. The experimental results show that our approach improved classification accuracy. |
Keywords | web page classification ; similarity measure ; classification algorithm without noise |
Indexed By | SCI |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Hardware & Architecture ; Computer Science, Software Engineering |
WOS ID | WOS:000173631200002 |
Publisher | SCIENCE CHINA PRESS |
Document Type | Journal article |
Identifier | http://119.78.100.204/handle/2XEOYT63/13542 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Li, XL |
Affiliation | 1.Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China 2.Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore |
Recommended Citation (GB/T 7714) | Li, XL,Shi, ZZ. Innovating web page classification through reducing noise[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,2002,17(1):9-17. |
APA | Li, XL,&Shi, ZZ.(2002).Innovating web page classification through reducing noise.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY,17(1),9-17. |
MLA | Li, XL,et al."Innovating web page classification through reducing noise".JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 17.1(2002):9-17. |