18 similar articles found (search time: 203 ms)
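Similarity computation is an important part of automatic question answering. To ensure that the answers in the candidate answer set are ranked sensibly, and to address the inability of traditional automatic question answering systems to evaluate similarity comprehensively and efficiently, this paper proposes using the composite index method to combine keyword similarity, semantic similarity, and other measures into a composite similarity. In addition, because some candidate answers contain too much redundant information, which hinders answer extraction, a decay similarity parameter is designed to reduce the effect of redundant sentence information on answer extraction. Experimental results show that the similarity algorithm based on the composite index method effectively improves question answering accuracy.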
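
The abstract above does not give its formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: the composite index is taken to be a weighted average of the individual similarity scores, keyword similarity is approximated by Jaccard overlap, the semantic score is supplied by the caller, and the `decay` function (with its `rate` parameter) is a hypothetical stand-in for the paper's decay similarity parameter.

```python
import re


def tokens(text):
    """Lowercase word tokens; a real system would use a proper segmenter."""
    return re.findall(r"\w+", text.lower())


def keyword_similarity(question, answer):
    """Jaccard overlap of word sets (an assumed stand-in for keyword similarity)."""
    q, a = set(tokens(question)), set(tokens(answer))
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


def decay(question, answer, rate=0.05):
    """Hypothetical decay factor: shrinks as the answer carries more
    words that do not occur in the question (redundant information)."""
    q = set(tokens(question))
    extra = [w for w in tokens(answer) if w not in q]
    return 1.0 / (1.0 + rate * len(extra))


def composite_similarity(scores, weights):
    """Weighted average of the individual similarity scores."""
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


def rank_candidates(question, candidates, semantic_scores, weights):
    """Return (score, candidate) pairs, best first."""
    ranked = []
    for candidate, semantic in zip(candidates, semantic_scores):
        scores = {
            "keyword": keyword_similarity(question, candidate),
            "semantic": semantic,  # assumed to come from an external model
        }
        score = composite_similarity(scores, weights) * decay(question, candidate)
        ranked.append((score, candidate))
    return sorted(ranked, reverse=True)
```

With equal semantic scores, a short candidate that shares keywords with the question ranks above a long candidate padded with unrelated words, which is the behaviour the decay parameter is meant to produce.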

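Research on Internet-based automatic question answering systems   Cited by: 1 (self-citations: 0, citations by others: 1)
盛秋艳 (Sheng Qiuyan), 《现代情报》, 2005, 25(4): 81-82
Automatic question answering is a very active research direction in natural language processing, and it draws on a wide range of natural language processing techniques. This paper describes the current state of automatic question answering and the techniques commonly used in automatic question answering systems. Such a system generally has three main components: question analysis, information retrieval, and answer extraction.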

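王日花 (Wang Rihua), 《情报科学》, 2021, 39(10): 76-87
[Purpose/Significance] To address the high cost of building datasets for automatic question answering systems, and the limitation that automatic question answering considers only the relevance of the question or the answer itself. [Method/Process] A dataset construction method that fuses an annotated question-answer repository with community question-answer data is proposed; a multi-layer heterogeneous network model of question keywords, questions, answers, and answer clusters is built; and an automatic question answering algorithm based on this model is given. Library corpora were collected and processed as experimental data, and the BERT-Cos, AINN, and BiMPM models were used as baselines for experiments and analysis. [Results/Conclusions] The experiments give the performance of each model on the library question answering task. The proposed model outperforms the other models on every evaluation metric and reaches an accuracy of 87.85%. [Innovation/Limitations] The proposed multi-source dataset construction method and automatic question answering model perform better on the question answering task than existing methods, and a recommendation on the word length of user questions is given based on an analysis of model performance.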

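This paper introduces cross-language automatic question answering systems, their implementation modes, and their working principles, and analyzes the methods and techniques involved, including answer type prediction, translation, information retrieval, and answer extraction. It compares the evaluation content and methods that the CLEF and NTCIR conferences apply to such systems. Finally, it describes applications of these systems.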

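With the rapid development of the Internet, people rely more and more on search engines. Traditional search engines are based only on keywords, whereas question answering systems can obtain the information users need quickly and accurately and represent a new generation of search engine. A question answering system lets users ask questions in natural language and returns the exact answer they need. It generally consists of three parts: question understanding, information retrieval, and answer extraction. This paper describes question type classification, a technique within question understanding, and gives a concrete implementation.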

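With the development of artificial intelligence, intelligent question answering systems have become a research hotspot and attracted growing attention. Unlike question answering systems for mainstream languages such as Chinese and English, Tibetan question answering systems lack large amounts of structured data to support a rich and comprehensive knowledge base. This study focuses on question answering data drawn from primary school Tibetan language textbooks and builds a high-quality Tibetan question answering dataset through rule-based filtering, manual correction, and annotation of question intent and similarity. Automatic evaluation and experiments show that the questions and replies in the dataset are well related in terms of knowledge. Human evaluation on a three-point scale shows that 98% of the samples match primary school pupils' cognition and Tibetan grammar rules, that the question-answer pairs are fluent, and that questions and answers are highly relevant to each other. Intent classification was run with BERT both with and without fused extracted words, and similarity was computed with tf-idf + BERT. Classification accuracy was 75% and 76% respectively, and similarity accuracy was 76%, which also supports the validity of the constructed question answering corpus for primary school Tibetan language course knowledge.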
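
The tf-idf part of the similarity computation mentioned above can be illustrated with a small sketch. It is not the study's pipeline: the BERT component is left out, tokenization is assumed to have been done already (Tibetan word segmentation is out of scope here), and the smoothed idf formula is one common choice rather than the one the authors necessarily used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf keeps a positive weight for terms that occur everywhere.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * idf[term] for term, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two questions with the same tokens score close to 1, while questions that share only a common function word score much lower; a threshold on this score is one simple way to decide whether two questions count as similar.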

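[Purpose/Significance] To evaluate the natural language question answering ability of four Chinese and foreign search engines (Google, Bing, Baidu, and Sogou), in order to reveal the trend of search engines evolving toward systems that combine search with automatic question answering, and to compare the natural language answering ability of different search engines on different question types. [Method/Process] Three types of questions (person, time, and location) were drawn from the question answering evaluation tracks of the Text REtrieval Conference and the Conference on Natural Language Processing and Chinese Computing and submitted as searches. Results were scored manually according to whether the search engine returned an exact answer or a featured snippet containing the correct answer, and were compared using one-way analysis of variance and multiple comparison tests. [Results/Conclusions] Mainstream Chinese and foreign search engines all have some natural language question answering ability, but considerable room for improvement remains. Google performs best overall but is weaker than Sogou on person questions. All of the search engines perform better on time questions than on person and location questions.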
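
As an illustration of the one-way analysis of variance step named above, here is a minimal sketch that computes only the F statistic from per-engine score lists. The scores in the usage note are made up for the example and are not the study's data; the p-value and the multiple comparison tests are not covered.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    groups: list of score lists, one list per search engine.
    Assumes at least two groups, more observations than groups,
    and non-zero within-group variance.
    """
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(group)) ** 2 for group in groups for x in group)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For the hypothetical groups `[[1, 2, 3], [2, 3, 4], [5, 6, 7]]` the between-group sum of squares is 26 and the within-group sum of squares is 6, so the function returns F = 13.0. A large F suggests that at least one group mean differs, which is when a multiple comparison test becomes relevant.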

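In contrast to traditional opinion mining research in the product domain, this article presents an exploratory study of each part of opinion mining in the general Chinese domain. A conditional random field model based on multiple linguistic features and candidate opinion targets is used to extract opinion expressions. The window-constrained nearest-neighbor method is improved to give a matching algorithm for pairs of opinion targets and opinion expressions, which also further refines the opinion target extraction results.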

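This technology is an intelligent question answering system based on an automatically generated knowledge base. Crawler techniques are used to collect useful information from web pages as question-answer pairs; an improved keyword matching algorithm is used to build the inference engine; when the knowledge base is generated, keywords are extracted from the answer of each question-answer pair and stored in the knowledge base as the key data for matching users to question-answer pairs. If the web page contains a description, it is taken directly; if not, an algorithm is used …

One of the main tasks of question answering services such as Sina iAsk (新浪爱问) and Baidu Zhidao (百度知道) is to classify questions, so that user-generated question data can be organized and then analyzed further. The practical needs of these services place high demands on question classification algorithms in terms of classification quality, computational complexity, and sensitivity to noisy data. Drawing on ideas from information retrieval, this paper proposes a classification algorithm based on ranking class documents, and analyzes and improves the algorithm from a language-model perspective. A series of experiments on a large-scale question dataset shows that the proposed algorithm achieves better classification results than traditional algorithms on the question classification task. It is also computationally light, suits large-scale data, and meets the requirements that question answering services place on question classification algorithms.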

Question answering (QA) aims at finding exact answers to a user’s question from a large collection of documents. Most QA systems combine information retrieval with extraction techniques to identify a set of likely candidates and then utilize some ranking strategy to generate the final answers. This ranking process can be challenging, as it entails identifying the relevant answers amongst many irrelevant ones. This is more challenging in multi-strategy QA, in which multiple answering agents are used to extract answer candidates. As answer candidates come from different agents with different score distributions, how to merge answer candidates plays an important role in answer ranking. In this paper, we propose a unified probabilistic framework which combines multiple evidence to address challenges in answer ranking and answer merging. The hypotheses of the paper are that: (1) the framework effectively combines multiple evidence for identifying answer relevance and their correlation in answer ranking, (2) the framework supports answer merging on answer candidates returned by multiple extraction techniques, (3) the framework can support list questions as well as factoid questions, (4) the framework can be easily applied to a different QA system, and (5) the framework significantly improves performance of a QA system. An extensive set of experiments was done to support our hypotheses and demonstrate the effectiveness of the framework. All of the work substantially extends the preliminary research in Ko et al. (2007a), "A probabilistic framework for answer selection in question answering," in Proceedings of NAACL/HLT.

Question answering systems assist users in satisfying their information needs more precisely by providing focused responses to their questions. Among the various systems developed for such a purpose, community-based question answering has recently received researchers’ attention due to the large amount of user-generated questions and answers in social question-and-answer platforms. Reusing such data sources requires an accurate information retrieval component enhanced by a question classifier. Question classification gives the system information about question categories, so that it can focus on questions and answers from categories relevant to the input question. In this paper, we propose a new method based on unsupervised Latent Dirichlet Allocation for classifying questions in community-based question answering. Our method first uses unsupervised topic modeling to extract topics from a large amount of unlabeled data. The learned topics are then used in the training phase to find their association with the available category labels in the training data. The category mixture of topics is finally used to predict the label of unseen data.

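[Purpose/Significance] To link the fragmented answers in social question answering communities and to provide users with high-quality answers on different topics and with better knowledge services. [Method/Process] First, the Doc2vec algorithm is used to compute the semantic similarity between answers and to build an answer semantic network. Second, the Louvain algorithm partitions the answer semantic network into communities, the TextRank algorithm extracts keywords from the documents under each topic, and a word cloud visualizes each topic. Finally, the PageRank algorithm ranks the clustered answer semantic network, which achieves topic aggregation and ranking of the answer documents. [Results/Conclusions] An empirical study was conducted on question answering data from Zhihu (知乎). The results show that the proposed answer aggregation and ranking method not only shows users intuitively the strength of association between answers and the main content of the answers under each topic, but also provides per-topic answer rankings and automatically filters high-quality answers for users. [Innovation/Limitations] The study introduces the answer semantic network and, based on it, proposes an answer knowledge organization method that integrates aggregation, topic visualization, and ranking.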
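
The final ranking step described above can be sketched as a weighted PageRank over an answer similarity graph. This is only an illustration under stated assumptions: the edge weights stand in for Doc2vec similarities and are made up, the Louvain and TextRank steps are omitted, and the damping factor of 0.85 is the conventional default rather than a value taken from the study.

```python
def pagerank(weights, damping=0.85, iterations=100, tol=1e-9):
    """Weighted PageRank by power iteration.

    weights: {node: {neighbor: similarity}}; for an undirected graph each
    edge is listed in both directions, and every neighbor is also a key.
    """
    nodes = list(weights)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_sum = {node: sum(weights[node].values()) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without edges is spread evenly,
        # so the ranks keep summing to 1.
        dangling = sum(rank[node] for node in nodes if out_sum[node] == 0)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            if out_sum[source] == 0:
                continue
            for target, weight in weights[source].items():
                new_rank[target] += damping * rank[source] * weight / out_sum[source]
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

In a graph where answer `a` is strongly similar to both `b` and `c`, while `d` is connected to nothing, `a` receives the highest rank and `d` the lowest. Running the function separately on each community would give the per-topic rankings the abstract describes.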

Question answering (QA) is the task of automatically answering a question posed in natural language. Currently, there exist several QA approaches, and, according to recent evaluation results, most of them are complementary. That is, different systems are relevant for different kinds of questions. This fact indicates that a pertinent combination of various systems should improve on the individual results. This paper focuses on this problem, namely, the selection of the correct answer from a given set of responses corresponding to different QA systems. In particular, it proposes a supervised multi-stream approach that decides about the correctness of answers based on a set of features that describe: (i) the compatibility between question and answer types, (ii) the redundancy of answers across streams, as well as (iii) the overlap and non-overlap information between the question–answer pair and the support text. Experimental results are encouraging; evaluated over a set of 190 questions in Spanish and using answers from 17 different QA systems, our multi-stream QA approach could reach an estimated QA performance of 0.74, significantly outperforming the estimated performance from the best individual system (0.53) as well as the result from the best traditional multi-stream QA approach (0.60).

Existing approaches to identifying the quality of answers in online health question answering (HQA) communities either rely on subjective human assessment or mainly use textual features. Such approaches may be time-consuming and lose the semantic information of answers. We present an automatic approach for predicting answer quality that combines sentence-level semantics with textual and non-textual features in the context of online healthcare. First, we extend the knowledge adoption model (KAM) theory to obtain the six dimensions of quality measures for textual and non-textual features. Then we apply the Bidirectional Encoder Representations from Transformers (BERT) model for extracting semantic features. Next, the multi-dimensional features are processed for dimensionality reduction using linear discriminant analysis (LDA). Finally, we incorporate the preprocessed features into the proposed BK-XGBoost method to automatically predict the answer quality. The proposed method is validated on a real-world dataset with 48121 question-answer pairs crawled from the most popular online HQA communities in China. The experimental results indicate that our method competes against the baseline models on various evaluation metrics. We found up to 2.9% and 5.7% improvement in AUC value in comparison with BERT and XGBoost models respectively.

We propose answer extraction and ranking strategies for definitional question answering using linguistic features and definition terminology. A passage expansion technique based on simple anaphora resolution is introduced to retrieve more informative sentences, and a phrase extraction method based on syntactic information of the sentences is proposed to generate a more concise answer. In order to rank the phrases, we use several kinds of evidence, including external definitions and definition terminology. Although external definitions are useful, they cannot cover all the possible targets. The definition terminology score, which reflects how definition-like the phrase is, is devised to assist the incomplete external definitions. Experimental results show that the proposed answer extraction and ranking methods are effective and that our proposed system is comparable to state-of-the-art systems.

This article addresses the issue of extracting contexts and answers of questions from posts of online discussion forums. In previous work, general-purpose graphical models have been employed without any customization to this specific extraction problem. Instead, in this article, we propose a unified approach to context and answer extraction by customizing the structural support vector machine method. The customization enables our proposal to explore various relations among sentences of posts and complex structures of threads. We design new inference algorithms to find or approximate the most violated constraint by utilizing the specific structure of forum threads, which enables us to efficiently find the global optimum of the customized optimization problem. We also optimize practical performance measures by varying loss functions. Experimental results show that our methods are both promising and flexible.

With the advances in natural language processing (NLP) techniques and the need to deliver more fine-grained information or answers than a set of documents, various QA techniques have been developed corresponding to different question and answer types. A comprehensive QA system must be able to incorporate individual QA techniques as they are developed and integrate their functionality to maximize the system’s overall capability in handling increasingly diverse types of questions. To this end, a new QA method was developed to learn strategies for determining module invocation sequences and boosting answer weights for different types of questions. In this article, we examine the roles and effects of the answer verification and weight boosting method, which is the core of the automatically generated strategy-driven QA framework, in comparison with a strategy-less, straightforward answer-merging approach and a strategy-driven approach with manually constructed strategies.
