
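Title-Based Automatic Classification of Chinese News Web Pages
Cite this article: Qian Aibing, Jiang Lan. Title-Based Automatic Classification of Chinese News Web Pages [J]. 现代图书情报技术 (New Technology of Library and Information Service), 2008, 24(10): 59-68.
Authors: Qian Aibing, Jiang Lan
Affiliations: 1. School of Economy and Commercial Management, Nanjing University of Chinese Medicine, Nanjing 210046
2. Department of Information Management, Nanjing University, Nanjing 210093
Abstract: Drawing on the idea of tf-idf weighting, this paper uses news titles as the basis for automatically classifying Chinese news Web pages, constructs a title-based automatic classification method for Chinese news, and designs several experiments to evaluate the various title-based classification methods. The experimental results show that classifying Chinese news Web pages by their titles greatly shortens processing time, saves storage space, and achieves fairly high accuracy; the improved category weighting method gives the best classification results.

Keywords: term frequency/inverse document frequency (tf-idf); news title; Chinese news Web pages; automatic classification
Received: 2008-07-02
Revised: 2008-07-23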

Automatic Classification Based on News Titles for Chinese News Web Pages
Qian Aibing, Jiang Lan. Automatic Classification Based on News Titles for Chinese News Web Pages[J]. New Technology of Library and Information Service, 2008, 24(10): 59-68.
Authors:Qian Aibing  Jiang Lan
Institution:(School of Economy and Commercial Management, Nanjing University of Chinese Medicine, Nanjing 210046, China) (Department of Information Management, Nanjing University, Nanjing 210093, China)
Abstract: This paper describes the automatic classification of Chinese news Web pages using news titles and a tf-idf weighting scheme, and constructs a correlation degree for news titles that determines the appropriate category for each news Web page. The performance of the proposed method is evaluated in terms of top-one, top-two, and top-three scores. The experimental evaluation demonstrates that the improved tf-idf weighting scheme with categories provides high accuracy in classifying Chinese news Web pages.
Keywords: tf-idf; News title; Chinese news Web pages; Automatic classification
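
The abstract does not give the paper's formulas, so the following is only a minimal sketch of how a title-based, category-weighted tf-idf classifier and its top-k evaluation might look. Everything in it is an assumption made for illustration: the function names (`train_category_weights`, `rank_categories`, `top_k_score`), the smoothed idf taken over categories, the use of a simple sum of term weights as the "correlation degree", and the toy English tokens standing in for titles that have already been word-segmented.

```python
import math
from collections import Counter, defaultdict


def train_category_weights(labeled_titles):
    """Build per-category term weights from (tokens, category) pairs.

    Assumed weighting (not taken from the paper): the term's relative
    frequency within the category times a smoothed inverse category frequency.
    """
    term_counts = defaultdict(Counter)  # category -> term -> count
    for tokens, category in labeled_titles:
        term_counts[category].update(tokens)

    n_categories = len(term_counts)
    # Number of categories in which each term occurs at least once.
    category_freq = Counter(
        term for counts in term_counts.values() for term in counts
    )

    weights = {}
    for category, counts in term_counts.items():
        total = sum(counts.values())
        weights[category] = {
            term: (count / total)
            * math.log((1 + n_categories) / category_freq[term])
            for term, count in counts.items()
        }
    return weights


def rank_categories(tokens, weights, k=3):
    """Return the k categories with the highest summed term weight."""
    scores = {
        category: sum(term_weights.get(t, 0.0) for t in tokens)
        for category, term_weights in weights.items()
    }
    # Sort by descending score; break ties by category name for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


def top_k_score(test_titles, weights, k):
    """Fraction of test titles whose true category is among the top k."""
    hits = sum(
        category in rank_categories(tokens, weights, k)
        for tokens, category in test_titles
    )
    return hits / len(test_titles)


# Toy, hypothetical training data: titles already split into tokens.
training = [
    (["stock", "market", "rises"], "finance"),
    (["bank", "cuts", "rates"], "finance"),
    (["team", "wins", "final"], "sports"),
    (["coach", "praises", "team"], "sports"),
]
weights = train_category_weights(training)
held_out = [
    (["market", "rates"], "finance"),
    (["team", "final"], "sports"),
]
print(rank_categories(["market", "rates"], weights, k=1))
print(top_k_score(held_out, weights, k=1))
```

Calling `top_k_score` with `k` set to 1, 2, and 3 corresponds to the top-one, top-two, and top-three scores named in the abstract. Because only title tokens are stored and scored, the per-page work stays small, which is consistent with the time and storage savings the paper reports, though the sketch itself measures neither.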
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.