
Data Provenance Based on the Evolution of Similar Web Page Text

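Cite this article:	Ni Jing, Meng Xianxue. Data Provenance Based on the Evolution of Similar Web Page Text [J]. Library and Information Service, 2016, 60(13): 134.

Authors:	Ni Jing, Meng Xianxue

Institution:	1. Economic Management School, Beijing Institute of Petrochemical Technology, Beijing 102617; 2. Agricultural Institute of Information, Chinese Academy of Agricultural Sciences, Beijing 100081

Funding:	This article is one of the research outputs of the Beijing Social Science Fund project "Data Provenance and Monitoring Countermeasures for Rumors in Social Networks" (project no. 14SHB010) and the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on Data Provenance and Trust Mechanisms in the Evolution of Public Opinion on Social Networks" (project no. 15YJAZH052).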

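Abstract:	[Purpose/significance] To address the lack of origin annotation in existing web page text, this paper proposes a method that uses the PROV ontology to discover origin relationships among similar web page texts. [Method/process] By combining clustering algorithms, automatic semantic annotation, and linked data construction with the PROV-POL provenance model, the method detects how web page text entities evolve and implements a two-level provenance scheme at the text level and the attribute level. [Result/conclusion] Experiments verify that it is feasible to trace the provenance of web page text using semantic web technologies and a data provenance model, although the recall of the clustering algorithm used in the experiments still needs improvement.
Keywords:	PROV model; content tracing; linked data
Received:	2016-03-22
Revised:	2016-06-18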
Derivation of Similar Web Text and Data Provenance

Ni Jing, Meng Xianxue. Derivation of Similar Web Text and Data Provenance [J]. Library and Information Service, 2016, 60(13): 134.

Authors:	Ni Jing, Meng Xianxue

Institution:	1. Economic Management School, Beijing Institute of Petrochemical Technology, Beijing 102617; 2. Agricultural Institute of Information, Chinese Academy of Agricultural Sciences, Beijing 100081

Abstract:	[Purpose/significance] To address the lack of provenance metadata in existing web pages, we put forward a method of automatic annotation. [Method/process] By combining a clustering algorithm, automatic semantic annotation, and linked data technology with the PROV-POL data provenance model, we detect the derivation of web page text entities and implement a data provenance structure at both the text level and the attribute level. [Result/conclusion] Tests show that using semantic web technology and the PROV model to obtain the data provenance of web page text is feasible. The recall rate of the clustering algorithm we applied needs to be improved. This method has promising practical value for Web provenance.

Keywords:	PROV model; content provenance; linked data
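
The abstract names the ingredients of the approach (grouping similar texts, then recording derivation with a PROV-based model) but not the algorithms themselves. The following is only a minimal sketch of the text-level idea under stated assumptions, not the authors' pipeline: Jaccard similarity over word shingles stands in for the unspecified clustering algorithm, the earlier-published page is assumed to be the origin, and the PROV-POL model is not reproduced. Only the W3C PROV-O property `prov:wasDerivedFrom` is a real term; `Page`, `shingles`, `jaccard`, `derivation_triples`, the `threshold` value, and the `example.org` URIs are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Namespace of the W3C PROV-O ontology.
PROV = "http://www.w3.org/ns/prov#"


@dataclass(frozen=True)
class Page:
    uri: str         # hypothetical identifier of the web page text
    published: date  # assumed to be known for every page
    text: str


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def derivation_triples(pages, threshold=0.5):
    """Link each page to its most similar earlier page, if similar enough.

    Assumption: among similar texts, the earlier-published one is the origin.
    Returns (subject, predicate, object) triples using prov:wasDerivedFrom.
    """
    ordered = sorted(pages, key=lambda p: p.published)
    sets = {p.uri: shingles(p.text) for p in ordered}
    triples = []
    for i, later in enumerate(ordered):
        best, best_sim = None, 0.0
        for earlier in ordered[:i]:
            sim = jaccard(sets[later.uri], sets[earlier.uri])
            if sim >= threshold and sim > best_sim:
                best, best_sim = earlier, sim
        if best is not None:
            triples.append((later.uri, PROV + "wasDerivedFrom", best.uri))
    return triples


pages = [
    Page("http://example.org/a", date(2016, 1, 1),
         "the quick brown fox jumps over the lazy dog"),
    Page("http://example.org/b", date(2016, 1, 5),
         "the quick brown fox jumps over the sleepy dog"),
    Page("http://example.org/c", date(2016, 1, 3),
         "completely unrelated text about semantic web technology"),
]

# Emit the result as N-Triples lines.
for s, p, o in derivation_triples(pages):
    print(f"<{s}> <{p}> <{o}> .")
```

In this toy input, pages `a` and `b` share 5 of 9 distinct shingles, so `b` is recorded as derived from `a`, while `c` gets no link. A fixed similarity threshold is a simplification; it trades recall for precision in the same way the abstract notes for its clustering step.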
