Similar Documents
 20 similar documents were found (search time: 15 ms).
1.
Documents in computer-readable form can be used to provide information about other documents, i.e. those they cite. Doing this efficiently requires procedures for computer recognition of citing statements, which is not easy, especially for multi-sentence citing statements. Computer recognition procedures have been developed that are accurate to the following extent: 73% of the words in statements selected by the computer procedures as citing statements are correctly attributable to the corresponding documents. The retrieval effectiveness of computer-recognized citing statements was tested as follows. First, for eight retrieval requests in inorganic chemistry, average recall by search of Chemical Abstracts Service indexing and Chemical Abstracts abstract-text words was found to be 50%. Words from citing statements referring to the papers to be retrieved were then added to the index terms and abstract words as additional access points, and searching was repeated; average recall increased to 70%. Only words from citing statements published within a year of the cited papers were used. With citing-statement words alone (published within a year), without index or abstract terms, average recall was 40%. When just the words of the titles of the cited papers were added to those citing-statement words, average recall increased to 50%.

2.
In a preceding experiment on text-searching retrieval for cancer questions, search words were selected by humans with the aid of a medical dictionary and cancer textbooks. Recall results were: (1) using only stems of question words (humanly stemmed), 20%; (2) adding dictionary search words, 29%; (3) adding textbook search words as well, 70%. For the experiment reported here, computer procedures for using the medical dictionary to select search words were developed. Recall results were: (1) for question stems (computer stemmed), 19%; (2) adding search words computer-selected from the dictionary, 24%. Thus the computer procedures, compared to human use of the dictionary, were 50% successful. Human and computer false-retrieval rates were almost equal. Some hypotheses about computer selection of search words from textbooks are also described.

3.
Passage retrieval (already operational for lawyers) has advantages in output form over reference retrieval and is economically feasible. Previous experiments in passage retrieval for scientists have demonstrated recall and false-retrieval rates as good as or better than those of present reference retrieval services. The present experiment involved a greater variety of forms of retrieval question. In addition, search words were selected independently by two different people for each retrieval question. The search words selected, in combination with the computer procedures used for passage retrieval, produced average recall ratios of 72% and 67%, respectively, for the two selectors. The false-retrieval rates were (except for one predictably difficult question) 13 and 10 falsely retrieved sentences per answer-paper retrieved, respectively.

4.
An expert system was developed in the area of information retrieval, with the objective of performing the job of an information specialist who assists users in selecting the right vocabulary terms for a database search. The system is composed of two components. The first is the knowledge base, represented as a semantic network in which the nodes are words, concepts, and phrases comprising a vocabulary of the application area, and the links express semantic relationships between those nodes. The second is the set of rules, or procedures, which operate upon the knowledge base, analogous to the decision rules or work patterns of the information specialist. The consulting process of the system comprises two major stages. During the "search" stage, relevant knowledge in the semantic network is activated, and search and evaluation rules are applied in order to find appropriate vocabulary terms to represent the user's problem. During the "suggest" stage, those terms are further evaluated, dynamically rank-ordered according to relevancy, and suggested to the user. The system can provide explanations of its findings, and backtracking is possible in order to find alternatives in case a suggested term is rejected by the user. This article presents the principles, procedures and rules utilized in the expert system.
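A toy sketch of the two-stage consulting process described above, not the system's actual rules: during a "search" stage, terms connected to the user's problem words in a small semantic network are activated, and during a "suggest" stage they are rank-ordered by a simple score. The network, weights and ranking rule are invented for illustration.

```python
# Toy illustration of the search/suggest stages (the network, relation weights
# and ranking rule below are invented, not taken from the described system).

semantic_net = {
    "retrieval": {"indexing": 0.9, "search": 0.8, "database": 0.6},
    "indexing":  {"thesaurus": 0.7, "keywords": 0.8},
    "search":    {"query": 0.9, "keywords": 0.6},
}

def suggest_terms(problem_words, net, depth=2):
    """Search stage: spread out from the user's words; suggest stage: rank."""
    scores = {}
    frontier = {w: 1.0 for w in problem_words}
    for _ in range(depth):
        next_frontier = {}
        for word, weight in frontier.items():
            for neighbour, link in net.get(word, {}).items():
                score = weight * link
                if score > scores.get(neighbour, 0.0):
                    scores[neighbour] = score
                    next_frontier[neighbour] = score
        frontier = next_frontier
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(suggest_terms(["retrieval"], semantic_net))
```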

5.
Bibliometric maps of field of science
The present paper is devoted to two directions in algorithmic classificatory procedures: journal co-citation analysis, as an example of citation networks, and lexical analysis of keywords in titles and texts. What is common to these approaches is the general idea of normalizing deviations of the observed data from the mathematical expectation. Applying the same formula leads to the discovery of statistically significant links between objects (journals in one case, keywords in the other). The results of the journal co-citation analysis are presented in tables and maps for the fields "Women's Studies" and "Information Science and Library Science". An experimental attempt at establishing textual links between words was carried out on two samples from the SSCI database: (1) EDUCATION and (2) ETHICS. The EDUCATION file included 2180 documents (of which 751 had abstracts); the ETHICS file included 807 documents (289 abstracts). Some examples of the results of this pilot study are given in tabular form. The binary links between words discovered in this way may form triplets or other groups with more than two member words.
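The abstract refers to "normalizing deviations of the observed data from the mathematical expectation" without reproducing the formula; the snippet below sketches one common standardized-deviation form purely to illustrate that general idea. The co-occurrence counts are invented; only the 2180-document size of the EDUCATION file comes from the abstract.

```python
# Illustration only: one common way to normalize the deviation of an observed
# co-occurrence count from its expected value (the paper's exact formula is
# not given in the abstract).

import math

def link_strength(observed, n_a, n_b, total):
    """Standardized deviation of an observed co-occurrence from expectation,
    assuming independence of the two items."""
    expected = n_a * n_b / total
    return (observed - expected) / math.sqrt(expected)

# Hypothetical counts: two keywords co-occur in 30 records; individually they
# occur in 120 and 90 records out of 2180 (the size of the EDUCATION file).
print(round(link_strength(observed=30, n_a=120, n_b=90, total=2180), 2))
```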

6.
7.
Xie Guiping, Liu Bin. 《现代情报》 (Journal of Modern Information), 2012, 32(9): 151-154
As the most authoritative domestic source of citation information, the CSCD's citation-retrieval performance directly affects the results of various kinds of scientific evaluation. Drawing on concrete examples, this paper offers exploratory suggestions on how to improve the recall of CSCD citation searches, from several angles including completing the source records, selecting search terms, and logically combining search terms.

8.
An indexing technique for text data based on word fragments is described. In contrast to earlier approaches, the fragments are allowed to overlap and are linked in a directed graph structure, reflecting the fact that many fragments ("superstrings") contain other fragments as substrings. This leads to a redundancy-free set of primary data pointers. By classifying the set of superstrings belonging to a fragment according to the position of the fragment within the superstring, one gains a novel way of supporting exact-match, partial-match, and masked partial-match retrieval by an index. The search strategies for the various retrieval cases are described.
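A minimal sketch of the containment idea, not the paper's actual index structure: overlapping fragments of a word are generated, and a directed link is recorded whenever one fragment occurs inside a longer "superstring", together with its position within that superstring.

```python
# Sketch of the superstring-containment idea: overlapping fragments of a word
# are linked when one fragment occurs inside a longer one, keeping the position.
# This is an illustration, not the paper's actual index structure.

from collections import defaultdict

def fragments(word, min_len=2, max_len=4):
    """All overlapping fragments of the given lengths."""
    return {word[i:i + n]
            for n in range(min_len, max_len + 1)
            for i in range(len(word) - n + 1)}

def containment_graph(frags):
    """Map each fragment to the superstrings containing it, with positions."""
    graph = defaultdict(list)
    for short in frags:
        for longer in frags:
            if short != longer and short in longer:
                graph[short].append((longer, longer.index(short)))
    return graph

frags = fragments("retrieval")
graph = containment_graph(frags)
print(graph["rie"])   # e.g. [('trie', 1), ('riev', 0)], order depends on set iteration
```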

9.
A supplement to SCI citation retrieval based on the ISI search platform
Lü Shuping, Xie Guiping. 《情报科学》 (Information Science), 2007, 25(11): 1655-1658
As the most authoritative international source of citation information, the SCI's citation-retrieval performance directly affects the results of various kinds of scientific evaluation. Drawing on concrete examples, this paper offers exploratory suggestions on how to improve the recall of SCI citation searches on the ISI platform, from several angles including completing the source records, selecting search terms, and logically combining search terms.

10.
Hu Zewen, Liu Shuo, Feng Rui, Zhang Xiaocai. 《现代情报》 (Journal of Modern Information), 2018, 38(11): 95-104
Based on the Web of Science database, and taking as samples the documents citing the library and information science (LIS) literature published by China, the United States and the United Kingdom during 1990-1994 and 2010-2014, this study quantitatively analyzes (1) the characteristics of documents citing the US literature in 1990-1994 and 2010-2014 (country, institution, source journal, open access, document type and language) and the differences between the two periods, and (2) the characteristics of documents citing the Chinese, US and UK LIS literature during 2010-2014 and the differences among the three countries, thereby revealing the distribution of citing characteristics and the pattern of scholarly communication in the three countries' LIS fields in both periods. The study finds that (1) during 1990-2015 the average annual publication output of the US was 24 times that of China; (2) the main citing countries of the LIS literature are largely the same for China, the US and the UK; (3) among the main institutions citing the US and UK LIS literature in the recent period (2010-2014), Chinese universities such as the City University of Hong Kong, the University of Chinese Academy of Sciences and Wuhan University appear; (4) the documents citing the US LIS literature show the highest degree of open access, at 10.73%; and (5) the main types of citing documents are essentially the same across the three countries (articles, proceedings papers, reviews, monographs and editorial material), with articles and proceedings papers together accounting for more than 91%.

11.
The research examines the notion that the principles underlying the procedure used by doctors to diagnose a patient's disease are useful in the design of "intelligent" IR systems, because the task of the doctor is conceptually similar to the computer (or human) intermediary's task in "intelligent information retrieval": to draw out, through interaction with the IR system, the user's query or information need. The research is reported in two parts. In Part II, an information retrieval tool is described which is based on "intelligent information retrieval" assumptions about the information user. In Part I, presented here, the theoretical framework for the tool is set out. This framework is borrowed from the diagnostic procedure currently used in medicine, called "differential diagnosis". Because of the severe consequences that attend misdiagnosis, the operating principle in differential diagnosis is (1) to expand the uncertainty in the diagnosis situation so that all possible hypotheses and evidence are considered, then (2) to contract the uncertainty step by step (from an examination of the patient's symptoms, through the patient's history and a physical examination (signs), to laboratory tests). The IR theories of Taylor, Kuhlthau and Belkin are used to demonstrate that these medical diagnosis procedures are already present in IR and that differential diagnosis is a viable model with which to design "intelligent" IR tools and systems.

12.
FACTS is an APL-based interactive on-line system used for retrieval of budget and accounting data. The system provides selective retrieval and manipulation of financial data for management in a development laboratory. The terms "teilnehmer" and "teilhaber" are defined, and it is argued that use of a teilnehmer system, such as APL, can considerably reduce the programming and monetary investment for information science systems applications. A brief discussion of APL's text editing facilities is also included to introduce this relatively unknown language to information scientists.

13.
The relevance of bibliometric indicators for scientific areas depends critically on the quality of their delineation. Macro-level studies, often based on a selected list of journals, accept a high degree of fuzziness. Micro-level studies rely on sets of individual articles in order to reduce noise and enhance precision of retrieval. The most usual information retrieval process is based on lexical queries with various levels of sophistication. In the experiment on Nanosciences reported here, this process was used as a first step, to delineate a 'seed' of literature. It has strong limitations, especially for emerging or transversal fields. In a second step, the alternative approach of citation linkages was used to expand the bibliography starting from the lexical seed. The extension process presented is governed by three parameters: two deal with the cited side (a threshold on the citation score, and specificity towards the field) and one with the citing side (a threshold on the number of relevant references), interplaying in the 'referencing structure function' (RSF) introduced in a previous work. This type of combination proves effective for delineating the transversal field of Nanosciences. Further improvements of the method are discussed.
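The referencing structure function (RSF) itself is defined in the authors' earlier work and is not reproduced in this abstract; the sketch below illustrates only the general citing-side step it describes, adding to a lexical 'seed' those citing papers whose reference lists hit the seed at least a threshold number of times. The citation data and threshold are invented.

```python
# Illustrative sketch of seed expansion by citation linkage (the paper's actual
# RSF parameters, i.e. the citation-score and specificity thresholds on the
# cited side, are not given here, so only the citing-side threshold is shown).

def expand_seed(seed, references, min_refs_to_seed=2):
    """Add every citing paper whose reference list hits the seed
    at least `min_refs_to_seed` times."""
    expanded = set(seed)
    for paper, refs in references.items():
        if paper not in expanded and len(set(refs) & set(seed)) >= min_refs_to_seed:
            expanded.add(paper)
    return expanded

# Hypothetical citation data: paper -> list of cited papers.
references = {
    "P4": ["P1", "P2", "X1"],        # cites two seed papers -> added
    "P5": ["P1", "X2", "X3"],        # cites only one        -> not added
    "P6": ["P2", "P3", "P1"],        # cites three           -> added
}
seed = {"P1", "P2", "P3"}
print(sorted(expand_seed(seed, references)))   # ['P1', 'P2', 'P3', 'P4', 'P6']
```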

14.
A quantitative analysis of China's research papers on ontology-based information retrieval, 2000-2006
Using the China Journal Net (CNKI) as the information source and keyword searching as the retrieval method, this paper applies quantitative analysis to Chinese research papers on ontology-based information retrieval published from 2000 to 2006, statistically analyzing their distribution over time, across journals and regions, and by author, research content and funding support, in order to identify the core journals and core institutions of this research topic and to discuss the current state of, and problems in, Chinese research on ontology-based information retrieval.

15.
Traditional Cranfield test collections represent an abstraction of a retrieval task that Sparck Jones calls the "core competency" of retrieval: a task that is necessary, but not sufficient, for user retrieval tasks. The abstraction facilitates research by controlling for (some) sources of variability, thus increasing the power of experiments that compare system effectiveness while reducing their cost. However, even within the highly abstracted case of the Cranfield paradigm, meta-analysis demonstrates that the user/topic effect is greater than the system effect, so experiments must include a relatively large number of topics to distinguish systems' effectiveness. The evidence further suggests that changing the abstraction slightly, to include just a bit more characterization of the user, will result in a dramatic loss of power or an increase in the cost of retrieval experiments. Defining a new, feasible abstraction for supporting adaptive IR research will require winnowing the list of all possible factors that can affect retrieval behavior down to a minimum number of essential factors.

16.
Retrieval techniques for easily missed search terms in different databases
Li Xiaoping, Fu Kaiyuan, Ma Jia. 《情报科学》 (Information Science), 2007, 25(2): 246-248
Using authoritative medical databases from home and abroad, this paper compares retrieval examples of easily missed search terms in the medical literature, namely terms containing a hyphen ("-") or Greek letters, and discusses how to improve recall and precision, so that more researchers can master these retrieval techniques and better support scientific research.
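The databases and exact search syntax discussed in the paper are not given in the abstract; the sketch below merely illustrates the general idea of expanding an easily missed term with hyphen and Greek-letter spelling variants before searching. The term used is only an example.

```python
# Illustration of generating spelling variants for easily missed terms
# (hyphens and Greek letters); the actual database syntax discussed in the
# paper is not shown here.

GREEK = {"α": "alpha", "β": "beta", "γ": "gamma"}

def term_variants(term):
    """Return hyphen/space/joined and Greek-letter spelling variants."""
    variants = {term, term.replace("-", " "), term.replace("-", "")}
    for symbol, name in GREEK.items():
        variants |= {v.replace(symbol, name) for v in set(variants)}
        variants |= {v.replace(name, symbol) for v in set(variants)}
    return sorted(variants)

print(term_variants("α-fetoprotein"))
# ['alpha fetoprotein', 'alpha-fetoprotein', 'alphafetoprotein',
#  'α fetoprotein', 'α-fetoprotein', 'αfetoprotein']
```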

17.
Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic and extra-linguistic knowledge. Since the mid-1980s, research has tended to pay more attention to context, giving discourse analysis a more central role. The research presented in this paper aims to check whether discourse variables have an impact on modern information retrieval and classification algorithms. In order to evaluate this hypothesis, a functional framework for information analysis in an automated environment is proposed, in which n-gram filtering and the k-means and Chen's classification algorithms are tested against sub-collections of documents based on the following discourse variables: "Genre", "Register", "Domain terminology", and "Document structure". The results obtained with the algorithms for the different sub-collections were compared to the MeSH information structure. They demonstrate that the n-gram approach does not show a clear dependence on discourse variables; the k-means classification algorithm does, but only on domain terminology and document structure; and Chen's algorithm depends clearly on all of the discourse variables. This information could be used to design better classification algorithms in which discourse variables are taken into account. Other minor conclusions drawn from these results are also presented.
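A brief illustration of one step in the kind of comparison described above: clustering a toy sub-collection with k-means over TF-IDF vectors using scikit-learn. The documents and cluster count are invented, and this is not the study's actual pipeline, which also involved n-gram filtering and Chen's algorithm.

```python
# Sketch: k-means clustering of a toy sub-collection over TF-IDF vectors.
# Requires scikit-learn; the documents and k are invented for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "randomized clinical trial of a new therapy",
    "clinical outcomes of the therapy in a cohort study",
    "discourse analysis of medical case reports",
    "genre and register variation in biomedical abstracts",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(labels)   # e.g. [0 0 1 1]; the grouping depends on the toy data
```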

18.
This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight-assignment technique computes weights for these words relative to the containing document. The weight is based on how widely the word is spread across the document, and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
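The abstract does not give the paper's exact weighting formula; the sketch below is only a hypothetical illustration of the stated idea, weighting a candidate index word by how evenly its occurrences are spread across a document's segments rather than by raw frequency alone. English tokens are used instead of Arabic purely for readability.

```python
# Hypothetical spread-based weighting (the paper's actual formula is not given
# in the abstract): a word occurring in many different segments of a document
# is weighted higher than one with the same frequency concentrated in one spot.

def spread_weight(word, segments):
    """Weight = term frequency x fraction of segments containing the word."""
    freq = sum(seg.count(word) for seg in segments)
    coverage = sum(1 for seg in segments if word in seg) / len(segments)
    return freq * coverage

# Toy document split into four segments (e.g. paragraphs), tokenised crudely;
# English tokens stand in for Arabic stems here.
doc = [
    "retrieval of arabic text documents".split(),
    "index words for arabic retrieval".split(),
    "stem extraction with grammatical rules".split(),
    "weights assigned to candidate index words".split(),
]

for w in ("retrieval", "arabic", "grammatical"):
    print(w, round(spread_weight(w, doc), 2))
```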

19.

Introduction

Poor harmonization of critical results management is present in various laboratories and countries, including Croatia. We aimed to investigate procedures used in critical results reporting in Croatian medical biochemistry laboratories (MBLs).

Materials and methods

An anonymous questionnaire consisting of 24 questions/statements related to critical-results reporting procedures was sent to the managers of MBLs in Croatia. Participants were asked to state how frequently they perform each procedure and their degree of agreement with statements about critical-value reporting, using a Likert scale. Total scores and mean scores for the corresponding individual statements, divided according to health-care setting, were calculated and compared.

Results

Responses from 111 Croatian laboratories (48%) were analyzed. General-practice laboratories (GPLs) re-analyzed the sample before reporting a critical result more often than hospital laboratories (HLs) (score: 4.86 (4.75-4.96) vs. 4.49 (4.25-4.72); P = 0.001) and more often reported the critical value exclusively to the responsible physician (4.46 (4.29-4.64) vs. 3.76 (3.48-4.03); P < 0.001). A high total score (4.69 (4.56-4.82)) was observed for selection of the critical-results list issued by the Croatian Chamber of Medical Biochemistry (CCMB), indicating a high level of harmonization for this aspect of critical-results management. Low total scores were observed for the statements regarding data recording and documentation of critical-result notification.

Conclusions

Differences in critical-results reporting practices between HLs and GPLs were found. The homogeneity of the least favorable responses, detected for data recording and documentation of critical-results notification, reflects the lack of specific national recommendations.

Key words: critical results, laboratory testing, quality indicators, survey, post-analytical phase

20.
One of the best-known measures of information retrieval (IR) performance is the F-score, the harmonic mean of precision and recall. In this article we show that the curve of the F-score as a function of the number of retrieved items always has the same shape: a fast concave increase to a maximum, followed by a slow decrease. In other words, there exists a single maximum, referred to as the tipping point, where the retrieval situation is 'ideal' in terms of the F-score. The tipping point thus indicates the optimal number of items to be retrieved, with more or fewer items resulting in a lower F-score. This empirical result is found in IR and link-prediction experiments and can be partially explained theoretically, expanding on earlier results by Egghe. We discuss the implications and argue that, when comparing F-scores, one should compare the F-score curves' tipping points.
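A minimal sketch, not taken from the article, of the F-score curve it describes: precision and recall are computed at every cutoff of a hypothetical ranked result list, F = 2PR/(P + R) is evaluated at each cutoff, and its single maximum marks the tipping point. The relevance labels are invented.

```python
# Sketch: F-score as a function of the number of retrieved items.
# The ranked list of relevance labels below is invented for illustration.

def f_score_curve(relevance, total_relevant):
    """Return the F-score at each cutoff k = 1..len(relevance)."""
    scores = []
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        precision = hits / k
        recall = hits / total_relevant
        f = 2 * precision * recall / (precision + recall) if hits else 0.0
        scores.append(f)
    return scores

# Hypothetical ranked list: 1 = relevant, 0 = not relevant; 6 relevant items
# exist in total, one of which is never retrieved.
ranked = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0]
curve = f_score_curve(ranked, total_relevant=6)

tipping_point = max(range(len(curve)), key=curve.__getitem__) + 1
print("F-scores by cutoff:", [round(f, 2) for f in curve])
print("Tipping point (optimal number retrieved):", tipping_point)
```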
