Research on the Construction of the Multi-user Interest Ontology Based on Word Co-occurrence



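Cite this article:	Tang Xiaobo, Xiao Lu. Research on the Construction of the Multi-user Interest Ontology Based on Word Co-occurrence [J]. Information Studies: Theory & Application (情报理论与实践), 2012, 35(5): 99-102.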

Authors:	Tang Xiaobo, Xiao Lu

Affiliation:	Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei 430072

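Funding:	Major Project of the Key Research Bases of Humanities and Social Sciences, Ministry of Education, "Research on Decision-oriented Integration of Enterprise Information Resources"; Humanities and Social Sciences Research Project, Ministry of Education, "Research on the Integration of Enterprise Information Resources"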

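Abstract:	User interest ontologies overcome a weakness of keyword-based user interest models, which cannot express user interests semantically. Most of them, however, are built from a domain ontology, which makes it hard to capture a user's multifaceted and latent interests, and building the domain ontology is itself difficult. This paper therefore proposes a method for constructing user interest ontologies based on word co-occurrence. A user's set of interest Web pages is identified from the page browsing records and converted into a set of interest texts through data processing. Concepts are extracted using TFIDF as the indicator, relationships between concepts are extracted from word co-occurrence statistics, and a scale-free K-medoids (K-中心点) clustering algorithm is used to adjust them. Merging the ontologies of related users yields a multi-user ontology that reflects user interests more completely at the semantic level and reveals latent interests.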
Keywords:	user interest; ontology construction; word co-occurrence
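
The first two steps named in the abstract, TFIDF-based concept extraction and co-occurrence-based relation extraction, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tfidf_concepts` and `cooccurrence`, the smoothed IDF formula, the `top_k` cut-off, and the choice of the whole document as the co-occurrence window are all assumptions made for illustration. Documents are assumed to be already tokenized into lists of words.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_concepts(docs, top_k=5):
    """Return the top_k terms ranked by TF-IDF summed over a user's documents."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        if not doc:
            continue  # skip empty documents to avoid dividing by zero
        for term, count in Counter(doc).items():
            # Smoothed IDF (an assumption): stays positive even for terms
            # that occur in every document.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            scores[term] += (count / len(doc)) * idf
    return [term for term, _ in scores.most_common(top_k)]


def cooccurrence(docs, concepts):
    """Count the documents in which each unordered pair of concepts co-occurs."""
    concept_set = set(concepts)
    pairs = Counter()
    for doc in docs:
        # Sorting makes the pair key independent of word order in the text.
        present = sorted(concept_set & set(doc))
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs
```

The pair counts returned by `cooccurrence` can be read as weighted edges between concepts; a higher count suggests a stronger relationship.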
Research on the Construction of the Multi-user Interest Ontology Based on Word Co-occurrence

Authors:	Tang Xiaobo et al.

Abstract:	A user interest ontology can make up for the deficiency of the keyword-based user interest model, which cannot express user interest semantically. In most cases, however, the user interest ontology is constructed from a domain ontology, which makes it difficult to reflect the various aspects of user interest and the potential interest. Furthermore, constructing the domain ontology is itself a challenge. This paper therefore proposes a method of constructing the user interest ontology based on word co-occurrence. We find the user's set of interest Web pages from the Web page browsing records and convert it into a set of interest texts through data processing. We then extract concepts by taking TFIDF as the index and extract the relationships between concepts from word co-occurrence statistics. Finally, a scale-free K-medoids (K-central point) clustering algorithm is used to adjust the ontology. By merging the ontologies of related users, we obtain a multi-user ontology. The method reflects user interest more completely at the semantic level and helps identify potential interest.

Keywords:	user interest; ontology architecture; word co-occurrence
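
The adjustment and merging steps can be sketched in the same spirit. The abstract does not describe the scale-free variant of the clustering algorithm or how related users are identified, so the sketch below swaps in a plain alternating K-medoids and simply sums the co-occurrence counts of users who are assumed to have been judged related already. The names `k_medoids`, `cooccurrence_distance`, and `merge_ontologies`, the distance `1 / (1 + count)`, and the choice of the first `k` items as initial medoids are all hypothetical.

```python
from collections import Counter


def cooccurrence_distance(pairs):
    """Turn pair counts into a distance: frequent co-occurrence means closer."""
    def dist(a, b):
        if a == b:
            return 0.0
        # Pairs are keyed by the alphabetically sorted tuple of the two concepts.
        return 1.0 / (1.0 + pairs.get(tuple(sorted((a, b))), 0))
    return dist


def merge_ontologies(user_pairs):
    """Sum the co-occurrence counts of several related users into one graph."""
    merged = Counter()
    for pairs in user_pairs:
        merged.update(pairs)  # Counter.update adds counts
    return merged


def k_medoids(items, dist, k, max_iter=100):
    """Plain alternating K-medoids (not the scale-free variant of the paper)."""
    medoids = list(items[:k])  # deterministic start: the first k items
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its nearest medoid.
        clusters = {m: [] for m in medoids}
        for item in items:
            nearest = min(medoids, key=lambda m: dist(item, m))
            clusters[nearest].append(item)
        # Update step: the new medoid minimises total distance to its cluster;
        # an empty cluster keeps its old medoid.
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, o) for o in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters
```

Each returned cluster groups concepts that tend to co-occur, with the medoid serving as a representative concept for that group of interests.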
This article is indexed in CNKI, Wanfang Data, and other databases.
