
Web Linkage Extraction Based on Machine Learning
Cite this article: Zhu Hongcan, Zou Kai. Web Linkage Extraction Based on Machine Learning [J]. Information Studies: Theory & Application (情报理论与实践), 2007, 30(2): 252-255.
Authors: Zhu Hongcan, Zou Kai
Institution: School of Management, Xiangtan University, Xiangtan, Hunan 411105
Abstract: Web pages on the Internet are connected by hyperlinks and provide rich information resources for daily life and commercial use. Link structure analysis plays an increasingly important role in many areas of World Wide Web research. However, many links are irrelevant to a page's topic, which causes topic drift. This paper analyzes the features of links and their anchors and presents a supervised machine learning method that automatically extracts the relevant links from a Web page. Experimental results indicate that the algorithm is of practical value.

Keywords: machine learning; linkage extraction; topic drift; Bayesian algorithm
Revised manuscript received: 2006-11-03
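
The abstract names a supervised, Bayesian approach to separating topic-relevant links from irrelevant ones but does not give the features or the exact model. The following is only a minimal sketch of that general idea, assuming a multinomial naive Bayes classifier with Laplace smoothing over word tokens taken from the anchor text and the URL. The class name, the helper functions, the labels, and the example.com training links are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(anchor_text, url):
    """Lowercased alphanumeric tokens from anchor text and URL (assumed features)."""
    return re.findall(r"[a-z0-9]+", (anchor_text + " " + url).lower())


class NaiveBayesLinkClassifier:
    """Multinomial naive Bayes with Laplace smoothing (illustrative, not the paper's model)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha            # smoothing constant
        self.token_counts = {}        # label -> Counter of token frequencies
        self.class_counts = Counter() # label -> number of training links
        self.vocab = set()

    def fit(self, links, labels):
        for (anchor, url), label in zip(links, labels):
            self.class_counts[label] += 1
            counts = self.token_counts.setdefault(label, Counter())
            for tok in tokenize(anchor, url):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, anchor, url):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n / total)  # log prior
            for tok in tokenize(anchor, url):
                if tok in self.vocab:    # tokens unseen in training are skipped
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, anchor, url):
        scores = self.log_scores(anchor, url)
        return max(scores, key=scores.get)


def extract_relevant(clf, links):
    """Keep only the (anchor, url) pairs the classifier labels as relevant."""
    return [link for link in links if clf.predict(*link) == "relevant"]


# Hypothetical hand-labeled training links, for illustration only.
TRAIN_LINKS = [
    ("machine learning tutorial", "http://example.com/ml/tutorial"),
    ("bayesian classifier notes", "http://example.com/ml/bayes"),
    ("learning algorithms survey", "http://example.com/papers/survey"),
    ("advertise with us", "http://ads.example.com/banner"),
    ("privacy policy", "http://example.com/privacy"),
    ("site map", "http://example.com/sitemap"),
]
TRAIN_LABELS = ["relevant"] * 3 + ["irrelevant"] * 3

clf = NaiveBayesLinkClassifier().fit(TRAIN_LINKS, TRAIN_LABELS)
```

Calling `extract_relevant(clf, page_links)` on the `(anchor, url)` pairs scraped from a page would then drop navigation and advertising links before any link structure analysis is run. A real system would need a much larger labeled set and probably richer features than this toy vocabulary.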
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.