Similar Documents
Found 20 similar documents (search time: 779 ms)
1.
范少萍  郑春厚  王娟 《情报科学》2012,(2):196-199,205
Drawing on grid technology and Semantic Web technology, and combining the characteristics of the knowledge grid with those of text resources, this paper studies text classification in a knowledge grid environment. It first analyzes the key problems that must be solved for effective classification of text resources in a knowledge grid environment, and on this basis constructs a text classification model for that environment. The model consists of four main modules: semantic interconnection, meta-sample integration, dynamic text updating, and text classification. It can serve as a guide and reference for subsequent research on text classification in knowledge grid environments.

2.
Text mining techniques and their application in patent information analysis   Cited by 1 (self: 0, other: 1)
张群 《现代情报》2006,26(3):209-210,213
This paper introduces the concept, main techniques, and general process of text mining, describes the application of text mining in patent information analysis, and reviews three text mining tools used in patent analysis: Intelligent Miner for Text, ThemeScape, and VantagePoint.

3.
Knowledge classification is a key problem in implementing enterprise knowledge management, but existing classification methods make it difficult to integrate knowledge management effectively with business management. The knowledge-pattern-based method for classifying enterprise text knowledge is oriented toward management objects such as business processes, roles, and organizational units, and uses knowledge patterns to describe both the enterprise's text knowledge and its management objects. Knowledge patterns are first extracted from the text knowledge and matched against meta-knowledge; the resulting enterprise knowledge patterns are then matched against the knowledge patterns of the management objects, and the ranked matching results form the final classification. Experiments show that the method is highly practical and that its classification accuracy meets the requirements of real applications.

4.
宁琳 《现代情报》2016,36(2):140
Text mining is an important branch of data mining. Based on the characteristics of syntactic rules, this paper proposes a design model for syntactic-rule-based text knowledge mining. It analyzes the workflow in terms of data preparation, syntactic rule construction, text preprocessing, text knowledge mining, and evaluation of mining results, with emphasis on the construction process of the syntactic rules, and finally validates the model experimentally. The design has research and application value for intelligent mining of text knowledge.

5.
Research on knowledge visualization for text mining based on SOM clustering   Cited by 1 (self: 0, other: 1)
This paper aims to present the results of text mining over defense news briefs as a visual knowledge map, making it easier for intelligence analysts to acquire knowledge. Visualizing text mining results is currently difficult in both method and technology; this paper introduces the SOM (self-organizing map) neural network algorithm into a text mining system. The method is particularly effective for knowledge visualization and, combined with the very clear hierarchical structure of a defense vocabulary ontology, it can aggregate the defense information collected by the system into well-ordered knowledge and present it to users as a color-block map. Experimental results show that the clustering is accurate and the visual interface is clear and simple, helping users identify hot topics, acquire knowledge, and support decision making.
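The SOM clustering step described above can be sketched in a few lines. The following minimal 1-D self-organizing map is an illustrative, self-contained pure-Python implementation; the node seeding, decay schedules, and toy document vectors are assumptions for demonstration, not the configuration used in the paper.

```python
import math

def train_som(vectors, n_nodes=2, epochs=100, lr0=0.5):
    """Minimal 1-D self-organizing map; nodes are seeded from spread-out samples."""
    step = max(1, len(vectors) // n_nodes)
    nodes = [list(vectors[(i * step) % len(vectors)]) for i in range(n_nodes)]
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)                    # decaying learning rate
        radius = max(0.5, n_nodes * (1.0 - t / epochs))  # shrinking neighborhood
        for v in vectors:
            # best-matching unit: node closest to the input in squared distance
            bmu = min(range(n_nodes),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], v)))
            for i in range(n_nodes):
                # neighborhood function decays with grid distance to the BMU
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                nodes[i] = [w + lr * h * (x - w) for w, x in zip(nodes[i], v)]
    return nodes

def assign(vectors, nodes):
    """Label each vector with the index of its best-matching node."""
    return [min(range(len(nodes)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], v)))
            for v in vectors]

# Toy 2-D "document" vectors: two clearly separated topical groups, interleaved.
docs = [[0.9, 0.1], [0.1, 0.9],
        [1.0, 0.0], [0.0, 1.0],
        [0.95, 0.05], [0.05, 0.95]]
labels = assign(docs, train_som(docs, n_nodes=2))
```

In a real system each node of the trained map becomes one color block, and documents mapped to the same node are displayed together.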

6.
王燕  温有奎 《情报理论与实践》2007,30(3):409-411,362
This paper describes the necessity of, and the process for, converting text units into knowledge units. Ontology technology is used to construct the knowledge units, an ontology description language is used to represent the knowledge, and the two main modules of the OTKTS system are described.

7.
Text analysis (also called "content analysis") is a method for analyzing texts that is widely used by intelligence agencies and researchers in many countries. By coding qualitative, semi-structured text, it allows qualitative material to be analyzed with quantitative methods, greatly improving the reliability of the analysis. This paper examines the distinctive advantages and basic steps of text analysis and explores its application in competitive intelligence. The method offers unique advantages in areas of competitive intelligence such as strategic group analysis and the analysis of competitors' assumptions, goals, strategies, and missions.

8.
陈辉 《中国科技信息》2004,27(19):32-33
Automatic text classification is a key enabling technology for managing the vast amount of text information on the Internet. This paper introduces the concept of text classification, its implementation process and related techniques, and several typical uses of text classification in network information services.
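As an example of the classification process such overviews describe, here is a minimal multinomial Naive Bayes text classifier with Laplace smoothing; this is a standard technique, though the abstract does not name a specific algorithm, and the toy training data is invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (token-list, label) pairs. Returns the fitted model."""
    word_counts = defaultdict(Counter)   # label -> token frequency table
    label_counts = Counter()             # label -> document count
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-posterior for the token list."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label in label_counts:
        lp = math.log(label_counts[label] / total_docs)          # class prior
        denom = sum(word_counts[label].values()) + len(vocab)    # Laplace denom
        for t in tokens:
            lp += math.log((word_counts[label][t] + 1) / denom)  # add-one smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Invented toy corpus for demonstration only.
train = [
    (["stock", "market", "shares", "rise"], "finance"),
    (["bank", "loan", "interest", "rate"], "finance"),
    (["team", "match", "goal", "score"], "sports"),
    (["coach", "league", "player", "win"], "sports"),
]
model = train_nb(train)
```

Smoothing keeps unseen words from zeroing out a class probability, which matters for the sparse vocabularies typical of short web texts.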

9.
In recent years, with the rapid development of the Internet, the number of microblog users has grown steadily. As computer crime has become increasingly rampant, network and system security has been studied extensively, but the security of online media content has only recently begun to receive attention. Addressing this problem, this paper draws on natural language understanding and Chinese information processing, analyzes the features of various kinds of harmful information, reviews progress in harmful-text processing in light of experiments in the authors' system, and studies a concept-network analysis model and filtering algorithms suited to filtering harmful text.

10.
Research on a text structure analysis method based on latent semantic indexing   Cited by 4 (self: 0, other: 4)
Text structure analysis is an important topic in text processing: it can effectively improve the precision of text retrieval, text filtering, and text summarization. After describing the physical and logical structure of text and the background of text analysis, this paper introduces latent semantic indexing into text structure analysis and proposes an LSI-based hierarchical analysis method. The method guarantees ordered, cohesive hierarchical partitions, is easy to apply and interpret, and its applications in text retrieval, text filtering, and text summarization are presented.

11.
Natural Language Processing (NLP) techniques have been successfully used to automatically extract information from unstructured text through a detailed analysis of its content, often to satisfy particular information needs. In this paper, an automatic concept map construction technique, Fuzzy Association Concept Mapping (FACM), is proposed for converting abstracted short texts into concept maps. The approach consists of a linguistic module and a recommendation module. The linguistic module is a text mining method that does not require the user to have any prior knowledge of NLP techniques. It incorporates rule-based reasoning (RBR) and case-based reasoning (CBR) for anaphora resolution, and aims to extract the propositions in text so as to construct a concept map automatically. The recommendation module is built on fuzzy set theory. It is an interactive process that suggests propositions for further human refinement of the automatically generated concept maps; the suggested propositions are relationships among concepts that are not explicitly found in the paragraphs. This technique helps stimulate individual reflection and generate new knowledge. Evaluation was carried out using the Science Citation Index (SCI) abstract database and CNET News as test data, both well-known sources of assured text quality. Experimental results show that the automatically generated concept maps conform to those produced manually by domain experts, the difference between them being proportionally small. The method gives users the ability to convert short scientific texts into a structured format that can easily be processed by computer. Moreover, it gives knowledge workers extra time to rethink their written text and to view their knowledge from another angle.
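The recommendation module's fuzzy suggestion of propositions can be illustrated with a simple co-occurrence-based membership degree. Note that the measure below (co-occurrence count over the smaller of the two individual frequencies) and the toy sentences are assumptions for demonstration, not necessarily the membership function used in FACM.

```python
from collections import Counter
from itertools import combinations

def fuzzy_associations(sentences, threshold=0.6):
    """Suggest concept pairs whose fuzzy association degree meets the threshold.

    sentences: list of concept lists (one list per sentence/paragraph).
    Membership of a pair = co-occurrences / min(individual occurrence counts),
    an illustrative fuzzy measure in [0, 1].
    """
    occur, co = Counter(), Counter()
    for concepts in sentences:
        uniq = sorted(set(concepts))      # sorted so pair keys are canonical
        occur.update(uniq)
        co.update(combinations(uniq, 2))
    degree = {pair: co[pair] / min(occur[pair[0]], occur[pair[1]]) for pair in co}
    return {pair: d for pair, d in degree.items() if d >= threshold}

# Toy concept occurrences per sentence (invented for illustration).
sents = [["map", "concept"], ["map", "concept", "fuzzy"],
         ["fuzzy", "set"], ["concept", "fuzzy"]]
links = fuzzy_associations(sents, threshold=0.6)
```

Pairs passing the threshold would be offered to the user as candidate propositions for refining the generated concept map.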

12.
13.
Model construction and research on the trading mechanism of the enterprise internal knowledge market   Cited by 14 (self: 0, other: 14)
戴俊  盛昭瀚 《预测》2004,23(4):48-51,19
Based on Davenport's theory of the enterprise internal knowledge market, this paper analyzes the transactional characteristics of knowledge flows within the enterprise and argues that knowledge flow is, in essence, knowledge trading. Using the Holmstrom-Milgrom principal-agent model as the quantitative tool and introducing the concept of willingness to trade knowledge, the model is adapted to construct two knowledge-trading models, demand-pull and supply-push. The two trading modes are then compared, and the managerial implications of the models are discussed.

14.
Research on automatic classification of Web pages based on an online news corpus   Cited by 1 (self: 0, other: 1)
Because Web pages are far richer than plain text in how they express information, Web page classification differs from plain-text classification. Targeting the characteristics of Chinese news pages on the Web, we propose a practical, dictionary-free algorithm for extracting topics from Web pages. The extracted topic concepts are incorporated into the classification knowledge base, and classification is then performed with the hybrid algorithm proposed by our research group. The experimental corpus is drawn from the financial news of Xinhuanet. Results show that classification performance improves compared with using the full text alone, without Web page features.

15.
Applications of a structured language knowledge base in natural language processing   Cited by 1 (self: 0, other: 1)
刘海涛 《情报科学》1992,13(5):44-49
This paper discusses the role of knowledge in natural language processing and the limitations of existing systems, introduces the concept of a full-text structural knowledge base, and explores its many applications in natural language processing.

16.
Digital documents such as books and journal articles have few text features, so their feature vectors express semantics poorly and classification results are weak. To address this, the paper proposes a digital document classification method based on semantic feature expansion. First, TF-IDF is used to obtain core feature terms with high TF-IDF values that represent the document text well; next, the core feature set is semantically expanded with concepts from the HowNet semantic dictionary and the open knowledge base Wikipedia, building a low-dimensional, semantically rich concept vector space; finally, classifiers such as MaxEnt and SVM are built to classify the digital documents automatically. Experimental results show that, compared with traditional feature-selection-based short-text classification, the method effectively expands the semantics of short-text features and improves classification performance.
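The first step of the method above, selecting core feature terms by TF-IDF weight, can be sketched as follows; the weighting variant (relative term frequency times a log inverse document frequency) and the toy corpus are illustrative choices, not the paper's exact setup.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists. Returns one {term: tf-idf weight} map per doc."""
    n = len(docs)
    df = Counter()                         # document frequency per term
    for d in docs:
        df.update(set(d))
    out = []
    for d in docs:
        tf = Counter(d)
        out.append({t: (tf[t] / len(d)) * math.log(n / df[t]) for t in tf})
    return out

def core_terms(weights, k=2):
    """Top-k highest-weighted terms of one document (the 'core feature words')."""
    return [t for t, _ in sorted(weights.items(), key=lambda x: -x[1])[:k]]

# Invented toy corpus for demonstration only.
docs = [["library", "digital", "classification", "digital"],
        ["library", "catalog", "search"],
        ["sports", "match", "score"]]
w = tfidf(docs)
```

In the paper's pipeline, the terms returned by `core_terms` would then be expanded with HowNet and Wikipedia concepts before classification.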

17.
Automatic text summarization attempts to provide an effective solution to today's unprecedented growth of textual data. This paper proposes an innovative graph-based text summarization framework for generic single- and multi-document summarization. The summarizer benefits from two well-established text semantic representation techniques, Semantic Role Labelling (SRL) and Explicit Semantic Analysis (ESA), as well as the constantly evolving collective human knowledge in Wikipedia. SRL is used to achieve sentence semantic parsing, whose word tokens are represented as a vector of weighted Wikipedia concepts using the ESA method. The essence of the developed framework is to construct a unique concept graph representation underpinned by semantic role-based multi-node (sub-sentence-level) vertices for summarization. We have empirically evaluated the summarization system using the standard publicly available dataset from the Document Understanding Conference 2002 (DUC 2002). Experimental results indicate that the proposed summarizer outperforms all related state-of-the-art comparators in single-document summarization on the ROUGE-1 and ROUGE-2 measures, while also ranking second in the ROUGE-1 and ROUGE-SU4 scores for multi-document summarization. The testing also demonstrates the scalability of the system: varying the evaluation data size is shown to have little impact on summarizer performance, particularly for the single-document summarization task. In a nutshell, the findings demonstrate the power of the role-based and vectorial semantic representation when combined with the crowd-sourced knowledge base in Wikipedia.
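The ROUGE-N measures used in the evaluation above reduce to n-gram recall against a reference summary. A minimal sketch follows (recall-only, single reference, no stemming or stopword handling, which full ROUGE toolkits do support).

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap / total reference n-grams."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    # clip each n-gram's credit at its count in the candidate
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / max(1, sum(ref.values()))

# Toy reference and candidate summaries (invented for illustration).
ref = "the cat sat on the mat".split()
cand = "the cat lay on the mat".split()
```

Here `rouge_n(cand, ref, 1)` counts 5 of the 6 reference unigrams as matched, and `rouge_n(cand, ref, 2)` matches 3 of the 5 reference bigrams.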

18.
19.
Text mining techniques for patent analysis   Cited by 1 (self: 0, other: 1)
Patent documents contain important research results. However, they are lengthy and rich in technical terminology, so analyzing them requires a great deal of human effort. Automatic tools for assisting patent engineers or decision makers in patent analysis are in great demand. This paper describes a series of text mining techniques that conforms to the analytical process used by patent analysts. These techniques include text segmentation, summary extraction, feature selection, term association, cluster generation, topic identification, and information mapping. The issues of efficiency and effectiveness are considered in the design of these techniques. Some important features of the proposed methodology include a rigorous approach to verifying the usefulness of segment extracts as document surrogates, a corpus- and dictionary-free algorithm for keyphrase extraction, an efficient co-word analysis method that can be applied to large volumes of patents, and an automatic procedure for creating generic cluster titles for ease of result interpretation. Evaluation of these techniques was conducted. The results confirm that the machine-generated summaries preserve more important content words for classification than some other sections do. To demonstrate feasibility, the proposed methodology was applied to a real-world patent set for domain analysis and mapping, showing that our approach is more effective than existing classification systems. The attempt in this paper to automate the whole process not only helps create final patent maps for topic analyses, but also facilitates or improves other patent analysis tasks such as patent classification, organization, knowledge sharing, and prior art searches.
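The term-association (co-word analysis) step mentioned above is commonly computed with the equivalence index: the squared co-occurrence count normalized by the two individual term frequencies. The sketch below uses that measure as an assumption, since the abstract does not specify the paper's exact formula, and the toy keyphrase lists are invented.

```python
from collections import Counter
from itertools import combinations

def coword(docs):
    """Association strength between keyphrases across documents.

    docs: list of keyphrase lists (one per patent).
    Returns {(a, b): E} where E = c_ab**2 / (c_a * c_b), the equivalence
    index used in classical co-word analysis, in [0, 1].
    """
    occ, co = Counter(), Counter()
    for terms in docs:
        uniq = sorted(set(terms))      # canonical pair ordering
        occ.update(uniq)
        co.update(combinations(uniq, 2))
    return {(a, b): co[(a, b)] ** 2 / (occ[a] * occ[b]) for (a, b) in co}

# Toy keyphrase sets per patent (invented for illustration).
patents = [["patent", "claim", "search"],
           ["patent", "claim"],
           ["search", "index"]]
m = coword(patents)
```

Thresholding `m` yields the weighted term network from which cluster maps and topic maps are drawn.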

20.
We present an image retrieval framework based on automatic query expansion in a concept feature space, generalizing the vector space model of information retrieval. In this framework, images are represented by vectors of weighted concepts, similar to the keyword-based representation used in text retrieval. To generate the concept vocabularies, a statistical model is built using Support Vector Machine (SVM)-based classification techniques. The images are represented as a "bag of concepts" comprising perceptually and/or semantically distinguishable color and texture patches from local image regions in a multi-dimensional feature space. To explore the correlation between the concepts and overcome the assumption of feature independence in this model, we propose query expansion techniques in the image domain from a new perspective based on both local and global analysis. For the local analysis, the correlations between concepts based on their co-occurrence pattern, and the metrical constraints based on neighborhood proximity between concepts in encoded images, are analyzed using local feedback information. We also analyze the concept similarities in the collection as a whole in the form of a similarity thesaurus and propose an efficient query expansion based on this global analysis. Experimental results on a photographic collection of natural scenes and a biomedical database of different imaging modalities demonstrate the effectiveness of the proposed framework in terms of precision and recall.
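The global-analysis expansion described above, building a similarity thesaurus over concept vectors and adding each query concept's most similar neighbors, can be sketched as follows. The cosine measure, the toy concept vectors, and the one-neighbor expansion policy are illustrative assumptions, not the paper's exact procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def expand_query(query, concept_vectors, top=1):
    """Append, for each query concept, its `top` most similar other concepts,
    ranked by a global similarity thesaurus over the concept vectors."""
    expanded = list(query)
    for q in query:
        ranked = sorted((c for c in concept_vectors if c not in expanded),
                        key=lambda c: -cosine(concept_vectors[q],
                                              concept_vectors[c]))
        expanded.extend(ranked[:top])
    return expanded

# Toy concept vectors over 3 latent feature dimensions (invented values).
vecs = {"sky":   [0.9, 0.1, 0.0],
        "cloud": [0.8, 0.2, 0.0],
        "grass": [0.0, 0.9, 0.1],
        "water": [0.1, 0.1, 0.9]}
```

A query on "sky" would be expanded with "cloud", its nearest neighbor in this toy thesaurus, before matching against the image concept vectors.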


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号