
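LSI-Based Semantic Retrieval Model for Scientific Data in the Solar-Terrestrial Space Domain
Authors: LIU Chunwei, ZOU Ziming, TONG Jizhou
Affiliations: 1. National Space Science Center, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100049
Funding: Supported by the Chinese Academy of Sciences Informatization Special Project (XXH12504-08) and the Chinese Academy of Sciences Strategic Priority Research Program (XDA04080000)
Abstract: Scientific data in solar-terrestrial space system science are large in volume, diverse in type, and complex in structure, and the interrelations among different concepts and events place high demands on scientific data retrieval in this field. Retrieval in the field nevertheless still relies mainly on conventional keyword-based techniques, which seriously degrades the quality of search results. This paper proposes a semantic retrieval model for such data. Starting from meta-information extracted for the solar-terrestrial space discipline, the model uses text-processing methods to convert the extracted information into a term-document matrix, analyzes that matrix with latent semantic indexing (LSI), and computes the semantic relevance between a query and each dataset. It then recommends scientific data to the user according to that relevance. Comparative experiments show that the model's recall is markedly better than that of the conventional method while its precision remains high. The model also supports semantic annotation and keyword extraction for scientific data, and can be applied to scientific data retrieval in other fields.

Keywords: solar-terrestrial space; scientific data; semantic retrieval; latent semantic indexing; metadata
Received: 2016-01-07
Revised: 2016-04-01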

LSI-based semantic retrieval model for scientific data in the solar-terrestrial space field
Authors: LIU Chunwei, ZOU Ziming, TONG Jizhou
Institution: 1. National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Scientific data in solar-terrestrial space science are large in volume, wide in variety, and complex in structure. The correlations among domain concepts and astro-events place high demands on scientific data retrieval in this field. However, the retrieval modules of the mainstream data sharing and publishing systems in this field are still built on conventional keyword-based retrieval. We present a semantic retrieval approach for solar-terrestrial space system scientific data. From the semantic information extracted from the scientific metadata of each dataset, we build a TF-IDF matrix using traditional text processing methods. Latent semantic indexing (LSI) is then applied to this matrix, and a similarity value is computed to rank each result by its relevance to the search request. The experimental results show that the approach achieves higher recall than conventional methods while maintaining high precision. The approach can be applied in other disciplines as well.
Keywords: solar-terrestrial space; scientific data; semantic retrieval; LSI; metadata
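
The pipeline summarized in the abstract (metadata text, TF-IDF term-document matrix, LSI, similarity ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four dataset descriptions, the query, the smoothed IDF formula, the choice of k = 2, the use of cosine similarity, and all function names are assumptions made for illustration only.

```python
import re
import numpy as np

def tokenize(text):
    # Lowercase alphabetic tokens; a real system would also handle stop words.
    return re.findall(r"[a-z]+", text.lower())

def build_tfidf(docs):
    """Return (term index, idf vector, term-by-document TF-IDF matrix)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            counts[index[t], j] += 1
    df = np.count_nonzero(counts, axis=1)
    # Smoothed IDF (an assumption; the abstract does not give the weighting).
    idf = np.log((1 + len(docs)) / (1 + df)) + 1.0
    return index, idf, counts * idf[:, None]

def lsi_fit(matrix, k):
    """Truncated SVD: keep the k largest singular triplets."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T  # rows of the last item are documents

def fold_in(query, index, idf, u_k, s_k):
    """Project a query into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(len(index))
    for t in tokenize(query):
        if t in index:
            q[index[t]] += 1
    return ((q * idf) @ u_k) / s_k

def rank(query_vec, doc_vecs):
    """Cosine similarity of the query to every document, best first."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (doc_vecs @ query_vec) / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims

# Hypothetical metadata descriptions of four datasets.
docs = [
    "solar wind plasma density velocity",
    "interplanetary plasma velocity upstream earth",
    "geomagnetic storm index magnetometer stations",
    "magnetometer geomagnetic field disturbance",
]
index, idf, tfidf = build_tfidf(docs)
u_k, s_k, doc_vecs = lsi_fit(tfidf, k=2)
order, sims = rank(fold_in("solar wind", index, idf, u_k, s_k), doc_vecs)
for i in order:
    print(f"{sims[i]:+.3f}  {docs[i]}")
```

In this toy corpus the second description contains neither "solar" nor "wind", so a plain keyword match would miss it. Because it shares "plasma" and "velocity" with the first description, the two land on the same latent direction and both score well above the geomagnetic datasets. This is the kind of recall gain the abstract attributes to LSI.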
This article is indexed in CNKI and other databases.