Similar documents
 20 similar documents found (search time: 15 ms)
1.
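This paper describes how commonsense knowledge queries are carried out in a commonsense knowledge base oriented toward agents, ontologies, and relations. A query stated in natural language is first converted into a descriptive query language, defined by the authors, that is oriented toward agents and ontologies. The descriptive query is then translated into an SQL statement. After the SQL statement is run against the commonsense knowledge base, which has been converted into relational form, a natural language generation algorithm turns the resulting relational tables into natural language. Experiments show that the new method answers queries faster than the earlier approach of querying the agents and ontologies directly, cutting query time by about 80%.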
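
The middle step of this pipeline, turning a structured query into SQL over a relational store, can be sketched as follows. The abstract does not define the descriptive query language or the relational schema, so everything here is an assumption made for illustration: the `facts` table, its columns `subj`, `rel`, `obj`, and the dict-shaped query are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per commonsense fact (subject, relation, object).
ALLOWED_COLUMNS = {"subj", "rel", "obj"}

def to_sql(query):
    """Translate a dict such as {'select': 'obj', 'where': {'subj': 'bird'}}
    into a parameterized SQL statement plus its parameter list."""
    if query["select"] not in ALLOWED_COLUMNS:
        raise ValueError("unknown column in select")
    if not set(query["where"]) <= ALLOWED_COLUMNS:
        raise ValueError("unknown column in where")
    sql = f"SELECT {query['select']} FROM facts"
    conditions = " AND ".join(f"{col} = ?" for col in query["where"])
    if conditions:
        sql += f" WHERE {conditions}"
    return sql, list(query["where"].values())

# Build a tiny in-memory knowledge base to query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subj TEXT, rel TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [("bird", "can", "fly"), ("fish", "can", "swim"), ("bird", "has", "wings")],
)

sql, params = to_sql({"select": "obj", "where": {"subj": "bird", "rel": "can"}})
rows = conn.execute(sql, params).fetchall()
```

Column names are checked against a whitelist and values are passed as parameters, so the generated statement cannot be altered by the query contents.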

2.
Recent advances in the semantic web have shown how entity-related searches have benefited from entity-based knowledge graphs. However, much of the commonsense knowledge about the real world is in the form of procedures or sequences of actions. Also, search log analysis shows that ‘how-to queries’ make up a significant share of users’ queries. Unfortunately, these kinds of knowledge are missing from most knowledge graphs and commonsense knowledge bases in use. To empower semantic search and other intelligent applications, computers need a much broader understanding of the world: properties of everyday objects, human activities, and more. Luckily, such knowledge is abundantly available online and can be accessed from how-to communities. One domain of interest to online communities is the health domain, where users usually seek home remedies for common health-related issues. Examples of such queries are ‘how to stop nausea using acupressure’ and ‘how to aid digestion naturally’. To answer such questions, we need systems that understand natural language, and knowledge bases that hold task frames of solutions in a holistic way, including the tools required, the agents involved, and the temporal order of the actions. Our goal is to construct a machine-readable, domain-targeted, high-precision procedural knowledge base containing task frames. We developed a pipeline of methods that leverages an open information extraction tool to extract procedural knowledge from online communities. We also devised a mechanism to canonicalize the task frames into clusters based on the similarity of the problems they intend to solve. The resulting know-how knowledge base, HealthAidKB, consists of more than 71K task frames, which are structured hierarchically and categorically, and can be used in many applications such as semantic search, digital personal assistants, human-computer dialog and computer vision. A comprehensive evaluation of our knowledge base shows high accuracy.

3.
This paper describes the development and testing of a novel Automatic Search Query Enhancement (ASQE) algorithm, the Wikipedia N Sub-state Algorithm (WNSSA), which utilises Wikipedia as the sole data source for prior knowledge. The algorithm is built upon the concept of iterative states and sub-states, harnessing Wikipedia’s data set and link information to identify and utilise recurring terms to aid term selection and weighting during enhancement. It is designed to prevent query drift by making callbacks to the user’s original search intent: the original query is persisted between internal states, together with the additional selected enhancement terms. The algorithm has been shown to improve both short and long queries by providing a better understanding of the query and the available data. It was compared against five existing ASQE algorithms that utilise Wikipedia as the sole data source, showing an average Mean Average Precision (MAP) improvement of 0.273 over the tested algorithms.
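
For readers unfamiliar with the metric quoted above, a minimal sketch of Mean Average Precision under its usual definition follows. The function names and the toy rankings are illustrative assumptions and are not taken from the paper's evaluation.

```python
def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant item appears,
    divided by the total number of relevant items."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two toy queries with made-up document ids.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```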

4.
Nowadays, data scientists are capable of manipulating and extracting complex information from time series data, given the current diversity of tools at their disposal. However, with the plethora of tools that target data exploration and pattern search, it may still take an extensive amount of time to develop methods that correspond to the data scientist's reasoning in order to solve their queries. The development of new methods, closely tied to the reasoning and visual analysis of time series data, is of great relevance to reducing the complexity and improving the productivity of pattern and query search tasks. In this work, we propose a novel tool capable of exploring time series data for pattern and query search tasks in a set of 3 symbolic steps: Pre-Processing, Symbolic Connotation and Search. The framework is called SSTS (Symbolic Search in Time Series) and uses regular expression queries to search for the desired patterns in a symbolic representation of the signal. By adopting a set of symbolic methods, this approach aims to increase the expressiveness available for solving standard pattern and query tasks, enabling the creation of queries more closely related to the reasoning and visual analysis of the signal. We demonstrate the tool's effectiveness by presenting 9 examples with several types of queries on time series. The SSTS queries were compared with standard code developed in Python, in terms of cognitive effort, vocabulary required, code length, volume, interpretation and difficulty metrics based on the Halstead complexity measures. The results demonstrate that this methodology is a valid approach and delivers a new abstraction layer for data analysis of time series.
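
The general idea of searching a symbolic representation with regular expressions can be sketched as below. This is not the SSTS implementation: the three-letter alphabet (`r` for rise, `f` for fall, `p` for plateau), the `tol` threshold and the function names are assumptions chosen for illustration.

```python
import re

def symbolize(signal, tol=0.0):
    """Encode each step between consecutive samples as one character:
    'r' (rise), 'f' (fall) or 'p' (plateau, change within tol)."""
    symbols = []
    for prev, cur in zip(signal, signal[1:]):
        diff = cur - prev
        if diff > tol:
            symbols.append("r")
        elif diff < -tol:
            symbols.append("f")
        else:
            symbols.append("p")
    return "".join(symbols)

def search(signal, pattern, tol=0.0):
    """Return (start, end) sample indices covered by each regex match.
    Symbol i describes the step from sample i to sample i + 1."""
    text = symbolize(signal, tol)
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]

signal = [0, 1, 2, 3, 2, 1, 1, 1, 2, 3, 1]
# A peak, expressed symbolically: one or more rises followed by one or more falls.
peaks = search(signal, r"r+f+")
```

Here the query for "a peak" is a single short regular expression, which is the kind of expressiveness gain the abstract describes.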

5.
Query auto completion (QAC) models recommend possible queries to web search users when they start typing a query prefix. Most of today’s QAC models rank candidate queries by popularity (i.e., frequency), and in doing so they tend to follow a strict query matching policy when counting the queries. That is, they ignore the contributions from so-called homologous queries: queries with the same terms but ordered differently, or queries that expand the original query. Importantly, homologous queries often express a remarkably similar search intent. Moreover, today’s QAC approaches often ignore semantically related terms. We argue that users are prone to combine semantically related terms when generating queries. We propose a learning-to-rank-based QAC approach, where, for the first time, features derived from homologous queries and semantically related terms are introduced. In particular, we consider: (i) the observed and predicted popularity of homologous queries for a query candidate; and (ii) the semantic relatedness of pairs of terms inside a query and pairs of queries inside a session. We quantify the improvement of the proposed new features using two large-scale real-world query logs and show that the mean reciprocal rank and the success rate can be improved by up to 9% over state-of-the-art QAC models.
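
The definition of homologous queries given above can be made concrete with a small counting sketch. It only illustrates the observed-popularity idea; the function names, the whitespace tokenization and the toy log are assumptions, and the paper's actual features and ranker are not reproduced here.

```python
def normalize(query):
    """Lowercase and collapse whitespace."""
    return " ".join(query.lower().split())

def homologous_counts(candidate, log):
    """Split log frequency into (exact, reordered, expansions) for a candidate.
    reordered: same set of terms in a different order.
    expansions: queries whose terms strictly contain the candidate's terms.
    Note: comparing term sets ignores repeated terms, a simplification."""
    cand = normalize(candidate)
    cand_terms = frozenset(cand.split())
    exact = reordered = expansions = 0
    for query, freq in log.items():
        q = normalize(query)
        terms = frozenset(q.split())
        if q == cand:
            exact += freq
        elif terms == cand_terms:
            reordered += freq
        elif cand_terms < terms:
            expansions += freq
    return exact, reordered, expansions

# Made-up query log: query string -> frequency.
log = {
    "new york hotel": 5,
    "hotel new york": 3,
    "cheap hotel new york": 2,
    "new york pizza": 7,
}
counts = homologous_counts("new york hotel", log)
```

A strict-matching model would credit the candidate with 5 occurrences; counting homologous queries adds evidence from 5 more.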

6.
Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods to learning and graph-based algorithms. The advent of human-generated knowledge bases like Wikipedia offers a further possibility in text summarization – they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence–concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization – users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that using incremental summarization can help in better understanding news articles.
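
Ranking by iterative updates on a bipartite sentence–concept graph can be illustrated with a generic mutual-reinforcement scheme in the style of HITS. The abstract does not specify the paper's graph models or update rules, so the rule below (a sentence is important if it mentions important concepts, and vice versa), the edge weights and the function name are assumptions.

```python
def rank_sentences(edges, n_sentences, n_concepts, iterations=50):
    """edges maps (sentence_index, concept_index) -> weight.
    Alternately propagate scores sentence -> concept -> sentence,
    normalizing each vector to sum to 1 after every update."""
    s = [1.0] * n_sentences
    for _ in range(iterations):
        c = [0.0] * n_concepts
        for (i, j), w in edges.items():
            c[j] += w * s[i]
        norm = sum(c) or 1.0
        c = [x / norm for x in c]

        new_s = [0.0] * n_sentences
        for (i, j), w in edges.items():
            new_s[i] += w * c[j]
        norm = sum(new_s) or 1.0
        s = [x / norm for x in new_s]
    return s

# Sentence 0 mentions concepts 0 and 1, sentence 1 mentions concept 1,
# sentence 2 mentions only the otherwise unshared concept 2.
edges = {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 1.0, (2, 2): 1.0}
scores = rank_sentences(edges, n_sentences=3, n_concepts=3)
```

With this rule the scores converge toward the dominant connected part of the graph, so the isolated sentence 2 ends up with a score near zero.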

7.
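Deng Ruren (邓茹仁), Wang Peng (王鹏). 《科技广场》, 2004(11): 11-12
Structural joins are the core problem in querying XML documents. XML queries fall into two classes: value queries and structure queries. By comparing two structural join algorithms for XML documents, one built on a B+-tree index and the other on an XR-tree index, this paper shows that the XR-tree-based structural join algorithm outperforms the B+-tree-based one.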

8.
Most document clustering algorithms operate in a high-dimensional bag-of-words space. The inherent presence of noise in such a representation degrades the performance of most of these approaches. In this paper we investigate an unsupervised dimensionality reduction technique for document clustering. This technique is based upon the assumption that terms co-occurring in the same context with the same frequencies are semantically related. On the basis of this assumption we first find term clusters using a classification version of the EM algorithm. Documents are then represented in the space of these term clusters and a multinomial mixture model (MM) is used to build document clusters. We empirically show on four document collections, Reuters-21578, Reuters RCV2-French, 20Newsgroups and WebKB, that this new text representation noticeably increases the performance of the MM model. By relating the proposed approach to the Probabilistic Latent Semantic Analysis (PLSA) model we further propose an extension of the latter in which an extra latent variable allows the model to co-cluster documents and terms simultaneously. We show on these four datasets that the proposed extended version of the PLSA model produces statistically significant improvements with respect to two clustering measures over all variants of the original PLSA and the MM models.

9.
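Li Yuxia (李玉霞), Li Hongyu (李红宇). 《科技通报》, 2012, 28(2): 149-151
Web logs contain a large amount of user browsing information, so mining user browsing patterns from them effectively is particularly important. After analyzing the problems with existing browsing-pattern mining algorithms, and taking the characteristics of Web logs into account, this paper improves on association rule mining and proposes TBPM, a browsing-pattern mining algorithm based on a sliding window. An incremental update algorithm is then designed on top of TBPM. Experiments on real data confirm the effectiveness of the algorithm.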

10.
Query suggestion is generally an integral part of web search engines. In this study, we first redefine the query suggestion problem and reduce it to “comparison of queries”. We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms which are used in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms experimentally over a set of queries and demonstrate a 66–90% statistically significant increase in relevance of query suggestions compared to a baseline method.

11.
The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order algorithm. It subsequently proved to be a reliable indicator for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data, and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. The summaries produced were evaluated by the ROUGE-1 metric, and the results showed that using QTO in a weighting combination resulted in the best performance. We also found that using a combination of more weighting components always produced improved performance compared to any single weighting component.
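
ROUGE-1, the metric used above, measures unigram overlap between a system summary and a human reference. A minimal recall-oriented sketch follows; it assumes plain whitespace tokenization and a single reference, and it is not the official ROUGE toolkit (which adds options such as stemming and multiple references).

```python
from collections import Counter

def rouge_1_recall(system, reference):
    """Fraction of reference unigrams that also occur in the system summary,
    with each token's matches clipped to its count in the system summary."""
    sys_counts = Counter(system.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum(min(count, sys_counts[tok]) for tok, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

# Toy example: five of the six reference tokens are matched.
score = rouge_1_recall("the cat sat on the mat", "the cat lay on the mat")
```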

12.
This study examines the post-M&A innovative performance of acquiring firms in four major high-tech sectors. Non-technological M&As appear to have a negative impact on the acquiring firm's post-M&A innovative performance. With respect to technological M&As, a large relative size of the acquired knowledge base reduces the innovative performance of the acquiring firm. The absolute size of the acquired knowledge base has a positive effect only during the first couple of years, after which the effect reverses and becomes negative. The relatedness between the acquired and acquiring firms’ knowledge bases has a curvilinear impact on the acquiring firm's innovative performance. This indicates that companies should target M&A ‘partners’ that are neither too unrelated nor too similar in terms of their knowledge base.

13.
Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been defined to help systems choose which documents should be retrieved in relation to an ambiguous query. However, the only approaches that show a genuine benefit for word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method we develop is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. For several TREC ad hoc collections we show that our method is useful in the case of queries which contain ambiguous terms. We are interested in improving the level of precision after 5, 10 and 30 retrieved documents (P@5, P@10 and P@30, respectively). We show that precision can be improved by 8% above current state-of-the-art baselines. We also focus on poorly performing queries.
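
The P@k measures mentioned above are the fraction of relevant documents among the first k retrieved. A minimal sketch, with a hypothetical function name and made-up document ids:

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Share of the top-k ranked documents that are relevant.
    Divides by k even if fewer than k documents were retrieved."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    return sum(1 for doc in top if doc in relevant_ids) / k

ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
p_at_5 = precision_at_k(ranked, relevant, 5)  # d1 and d2 are in the top 5
```

Reordering a retrieved list, as the method in this abstract does, changes P@k only by moving relevant documents into or out of the top k.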

14.
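An empirical study of the relationship between management control and knowledge acquisition in Sino-foreign joint ventures
Li Chang (黎常). 《科学学研究》, 2008, 26(6): 1267-1275
Abstract: Joint ventures are an important channel through which firms in developing countries learn and acquire knowledge. Drawing on theories of management control in international joint ventures and of organizational learning, this study examines how the Chinese partner's management control over the joint venture affects the acquisition of different types of knowledge, and analyzes how learning capability and cooperative trust moderate the relationship between management control and knowledge acquisition. The study finds that the Chinese partner's degree of operational control lowers its acquisition of both tacit and explicit knowledge; that the Chinese partner's knowledge acquisition is highest in joint ventures with equal equity shares; and that cooperative trust moderates the relationship between management control and knowledge acquisition.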

15.
Noetica is a tool for structuring knowledge about concepts and the relationships between them. It differs from typical information systems in that the knowledge it represents is abstract, highly connected and includes meta-knowledge (knowledge about knowledge). Noetica represents knowledge using a strongly-typed semantic network. By providing a rich type system it is possible to represent conceptual information using formalised structures. A class hierarchy provides a basic classification for all objects. This allows for a consistency of representation that is not often found in “free” semantic networks and gives the ability to easily extend a knowledge model while retaining its semantics. We also provide visualisation and query tools for this data model. Visualisation can be used to explore complete sets of link-classes, show paths while navigating through the database, or visualise the results of queries. Noetica supports goal-directed queries (a series of user-supplied goals that the system attempts to satisfy in sequence) and path-finding queries (where the system finds relationships between objects in the database by following links).

16.
Search sessions consist of a person presenting a query to a search engine, followed by that person examining the search results, selecting some of those search results for further review, possibly following some series of hyperlinks, and perhaps backtracking to previously viewed pages in the session. The series of pages selected for viewing in a search session, sometimes called the click data, is intuitively a source of relevance feedback information to the search engine. We are interested in how that relevance feedback can be used to improve the quality of search results for all users, not just the current user. For example, the search engine could learn which documents are frequently visited when certain search queries are given.
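
The closing example, learning which documents are frequently visited for a given query, amounts to aggregating click data per query. A minimal sketch under simple assumptions: sessions are (query, clicked documents) pairs, queries are matched after lowercasing, and the names and ids are hypothetical.

```python
from collections import Counter, defaultdict

def click_counts(sessions):
    """Aggregate clicks per normalized query across all users' sessions."""
    counts = defaultdict(Counter)
    for query, clicked_docs in sessions:
        counts[query.lower().strip()].update(clicked_docs)
    return counts

def top_documents(counts, query, n=3):
    """Most frequently clicked documents for a query, most clicked first."""
    return [doc for doc, _ in counts[query.lower().strip()].most_common(n)]

# Made-up sessions from different users.
sessions = [
    ("python tutorial", ["doc_a", "doc_b"]),
    ("Python tutorial", ["doc_a"]),
    ("java tutorial", ["doc_c"]),
]
counts = click_counts(sessions)
best = top_documents(counts, "python tutorial", n=1)
```

A search engine could use such counts as one ranking signal among others; raw click counts are biased toward results already shown near the top.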

17.
We investigated the searching behaviors of twenty-four children in grades 6, 7, and 8 (ages 11–13) in finding information on three types of search tasks in Google. Children conducted 72 search sessions and issued 150 queries. Children's phrase- and question-like queries combined were much more prevalent than keyword queries (70% vs. 30%, respectively). Fifty-two percent of the queries were reformulations (33 sessions). We classified children's query reformulation types into five classes based on the taxonomy by Liu et al. (2010). We found that most query reformulations were by Substitution and Specialization, and that children hardly repeated queries. We categorized children's queries by task facets and examined the way they expressed these facets in their query formulations and reformulations. The oldest children tended to target the general topic of search tasks in their queries most frequently, whereas younger children expressed one of the two facets more often. We assessed children's achieved task outcomes using the search task outcomes measure we developed. Children were mostly more successful on the fact-finding and fully self-generated task and partially successful on the research-oriented task. Query type, reformulation type, achieved task outcomes, and expressing task facets varied by task type and grade level. There was no significant effect of query length in words or of the number of queries issued on search task outcomes. The study findings have implications for human intervention, digital literacy, search task literacy, as well as for system intervention to support children's query formulation and reformulation during interaction with Google.

18.
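Intense competition pushes firms to adopt open innovation strategies that reach across organizational boundaries, acquiring rich external knowledge to offset their own shortcomings and promote growth. However, theoretical analysis and empirical research on the relationships among innovation openness, firm growth, and knowledge acquisition remain insufficient. Using a sample of 316 Chinese high-tech firms, this study applies structural equation modeling to test these relationships. The results show that the breadth and depth of innovation openness affect the growth of high-tech firms differently, and that the mediating roles of different types of knowledge acquisition also differ. Specifically, openness breadth has no significant effect on firm growth, whereas openness depth has a positive effect. Openness breadth has no significant effect on explicit knowledge acquisition but a positive effect on tacit knowledge acquisition, whereas openness depth positively affects both explicit and tacit knowledge acquisition. Explicit knowledge acquisition has no significant effect on firm growth, whereas tacit knowledge acquisition fully mediates the relationship between openness breadth and firm growth and partially mediates the positive effect of openness depth on firm growth. The implications are that firms should take open innovation strategy seriously and use open innovation networks sensibly and efficiently; keep increasing openness depth while keeping openness breadth under effective control; and acquire knowledge through multiple channels, tacit knowledge in particular.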

19.
As a significant source of knowledge, virtual communities have stimulated interest in knowledge management research. Nonetheless, very few studies to date have examined the demand-side knowledge perspective, such as knowledge acquisition in virtual communities. In order to explore the knowledge acquisition process within virtual communities, this study proposes a cognitive selection framework of knowledge acquisition strategy in virtual communities. The proposed framework takes a cognitive perspective to identify how knowledge recipients select their strategy for acquiring specialized knowledge, emphasizing their cognitive goals (e.g., cognitive replication and innovation) and cognitive motivators (e.g., virtual community self-efficacy, heightened enjoyment, and time resources). Our results suggest that knowledge recipients’ cognitive motivators differentially influence their cognitive goals (cognitive replication and innovation), which, in turn, are related to their selection of knowledge acquisition strategy (static and dynamic acquisition strategy, respectively).

20.
In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.
