Similar literature
1.
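Applying metadata requires controlled vocabularies suited to the subject domain in order to meet users' retrieval needs, yet little is known about which terms users actually search with. To find out what terms users of a digital education library use when searching, the authors mined search logs. After a preliminary analysis of more than 1 million search terms contained in over 400,000 search records, the authors focused on how controlled terms were used in searches and compared them with the uncontrolled terms that users employed. They found that users used controlled terms far more frequently than uncontrolled terms when looking for information. Further analysis of uncontrolled-term usage can provide a source for supplementing and updating the controlled terms, and can also supply important information for establishing corresponding semantic relationships between controlled and uncontrolled terms.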

2.
Ensuring quick and consistent access to large collections of unstructured documents is one of the biggest challenges facing knowledge-intensive organizations. Designing specific vocabularies to index and retrieve documents is often deemed too expensive, full-text search being preferred despite its known limitations. However, the process of creating controlled vocabularies can be partly automated thanks to natural language processing and machine learning techniques. With a case study from the biopharmaceutical industry, we demonstrate how small organizations can use an automated workflow in order to create a controlled vocabulary to index unstructured documents in a semantically meaningful way.

3.
This study examined the characteristics of users' free-text queries submitted to RILM Abstracts of Music Literature (a music literature database), and compared those queries with the controlled vocabularies used by RILM. Search-log analysis identified 11 categories of user-created search terms, and mapped each user-created search term to RILM's index terms, assessing whether it was a perfect match, a partial match, or no match. Only 30.04% of the user-created search terms did not match RILM's index terms. Most of the partial-matching and non-matching user-created search terms were personal names, work titles, and topical terms. Suggestions are offered to enhance RILM's controlled vocabularies.

4.
This paper provides an overview of the research into current medical vocabularies and their impact on searching the Web for health information. The Web provides growing opportunities for laypersons to gain knowledge about specific health conditions, though research to date has been incomplete. Many studies have examined aspects of controlled medical vocabularies. Other studies have examined aspects of medical Web searching vocabularies. In this context, there is a growing need to examine more closely laypersons' Web queries using controlled medical vocabularies that were designed to serve the needs of medical professionals. It may be the case that the average consumer of Web health services is not able to use correct medical terminology, and may not be able to choose analogous or synonymous terms from a search result list. Our review suggests a growing need for studies to examine the current applicability of controlled medical vocabularies as well as alternatives to semantic query by Web search engine users.

5.
There have been ample suggestions in the literature that terms added to documents from Flickr and Wikipedia can complement traditional methods of indexing and controlled vocabularies. At the same time, adding new metadata to existing metadata objects may not always add value to those objects. The potential added-value of using user-contributed (“social”) terms from Flickr and the English Wikipedia in image indexing is compared with using two expert-created controlled vocabularies (the Thesaurus for Graphic Materials and the Library of Congress Subject Headings) without those social terms. Experiments confirmed that the social terms did add value, relative to terms from the controlled vocabularies. The median rating for the usefulness of social terms was significantly higher than the baseline rating, but was lower than the ratings for the terms from the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. Furthermore, complementing the controlled vocabulary terms with social terms more than doubled the average coverage of participants' terms for a photograph. The relationships between user demographics and users' perceptions of the value of terms were also investigated, as well as the relationships between user demographics and indexing quality, as measured by the number of terms participants assigned to a photograph. Participants with more tagging and indexing experience assigned a greater number of tags than did other participants.

6.
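[Purpose/Significance] This study surveys and analyzes research data repositories in the life sciences and discusses research data management services in that field. [Method/Process] Using re3data.org, the open registry and directory of data repositories, it analyzes the distribution of life-science research data repositories by year of establishment, country, institution, subject area, and degree of openness. It then selects six representative repositories (Genbank, Dryad, ArrayExpress, Purdue University Research Repository, Biosharing, and dbGaP) for an in-depth analysis of their service content and models in terms of data access, reuse, and storage. [Result/Conclusion] The United States and the United Kingdom lead the construction and sharing of life-science research data repositories, and both have established research data policies at the national level and at the funder level. China can draw on their mature experience to accelerate the development of strategic plans and a policy framework. Funding agencies should play a guiding role, promote data management and sharing in service content and models, and build high-impact, domain-specific data repositories with integrated data management services.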

7.
Communication Monographs, 2012, 79(2): 176-198
This article examines connections between communication and identity. We present an analysis of actual, recorded social interactions in order to describe intersections between identity and vocabulary selection. We focus on how, in selecting or deselecting particular terms (e.g., cephalic, doula, cooker) speakers can display both their own identities and the identities of others. We show how these identities are constructed in part through speakers' selection and competent deployment of the specialist vocabularies associated with particular territories of expertise, how identities can be challenged when cointeractants presume understanding problems with specialist vocabularies, and how they can be defended (more or less vigorously) against such challenges with claims or displays of understanding. This conversation analytic approach to talk-in-interaction documents how specialist vocabularies can be deployed, in situ, in the construction of social identities. In describing how communication is used in the enactment and construction of identity, our findings contribute to the developing body of research specifying communication practices through which identity is constructed and showing how salient identities are made manifest in interaction.

8.
Perhaps the greatest power of folksonomies, especially when set against controlled vocabularies like the Library of Congress Subject Headings, lies in their capacity to empower user communities to name their own resources in their own terms. This article analyzes the potential and limitations of both folksonomies and controlled vocabularies for transgender materials by analyzing the subject headings in WorldCat records and the user-generated tags in LibraryThing for books with transgender themes. A close examination of the subject headings and tags for twenty books on transgender topics reveals a disconnect between the language used by people who own these books and the terms authorized by the Library of Congress and assigned by catalogers to describe and organize transgender-themed books. The terms most commonly assigned by users are far less common or non-existent in WorldCat. The folksonomies also provide spaces for a multiplicity of representations, including a range of gender expressions, whereas these entities are often absent from Library of Congress Subject Headings and WorldCat. While folksonomies are democratic and respond quickly to shifts and expansions of categories, they lack control and may inhibit findability of resources. Neither tags nor subject headings are perfect systems by themselves, but they may complement each other well in library catalogs. Bringing users’ voices into catalogs through the addition of tags might greatly enhance organization, representation, and retrieval of transgender-themed materials.

9.
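A study on the application of folksonomy and controlled vocabularies in OPAC   Total citations: 1, self-citations: 0, citations by others: 1
Information organization tools built on traditional controlled vocabularies, such as subject heading lists and thesauri, show clear shortcomings in the Web 2.0 environment, which affects retrieval quality in online public access catalogs (OPACs). The paper analyzes the strengths and weaknesses of folksonomies and controlled vocabularies, argues that the two complement each other well, and proposes a feasible model for applying them in OPACs.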

10.
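As a type of named entity, organizations play an important role in building data infrastructure for the digital humanities, and designing a flexible, extensible organization ontology model and vocabulary is an unavoidable problem. Research on organization ontologies in China is relatively limited: there is no systematic method for building an extensible, reusable ontology model, nor a vocabulary that can support practical applications. Research abroad has produced substantial results, but existing general-purpose organization ontology models do not go far enough in revealing complex relationships between organizations or in describing their historical evolution, and so struggle with real-world complexity. To address these problems, and building on existing organization ontology research and technical developments in China and abroad, the article draws on methods for constructing domain ontologies to design an organization ontology model and vocabulary that is general-purpose on the World Wide Web, supports reuse, and allows for future extension. Its feasibility is verified through its application in the digital humanities data infrastructure of the Shanghai Library.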

11.
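The article shares several takeaways from the author's attendance at the 2013 international conference of the Dublin Core Metadata Initiative (DCMI). It focuses on two different ways of using authority data to support real-time data mashups, two different levels and representations of metadata property mapping, and three different implementation approaches for digital library data models.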

12.
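The study surveys the medical controlled vocabularies in Taxonomy Warehouse in terms of number, size, compiling organization, subject distribution, language, and application, and outlines the development of mainstream medical vocabularies such as UMLS, MeSH, and ICD. It concludes that, in the Semantic Web environment, medical controlled vocabularies outside China show trends toward conversion into ontologies, intelligent updating and application, collaborative compilation with users, dynamic integration and decomposition, and publication as linked data.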

13.
This study concerns the overlap between author-supplied keywords and Library of Congress Subject Headings (LCSH) in Electronic Thesis and Dissertation (ETD) bibliographic records in the library catalog. The article provides a discussion on uniqueness, matching, and complementariness based on a replication of Strader's methodology and rubric from a 2009 article. Findings support most of Strader's conclusions, including the complementary nature of keywords and controlled vocabularies. Both keywords and LCSH provide unique terms that enhance access. Researchers also broke new ground regarding partial matching, particularly within LCSH. The fact that uniqueness matters has implications for the continued use of LCSH, for LCSH maintenance, and for further research.

14.
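[Purpose/Significance] This study surveys the current state of research data repositories in chemistry, giving researchers an overview of the field's data repositories and a reference for choosing a suitable repository in which to publish research data. From the perspective of library services for researchers, it also explores data services built around the publication of chemistry research data. [Method/Process] Using three repository registries and directories (re3data.org, Databib, and OAD), the study examines chemistry research data repositories by country of creation, date of establishment, repository type, the areas of chemistry covered by the stored data, and openness. It summarizes the service characteristics of repositories built specifically for chemistry data and selects three representative repositories (Cambridge Structural Database, ChemSpider, and ChemSynthesis) for an in-depth analysis of their service content. [Result/Conclusion] Chemistry research data repositories are relatively numerous, are concentrated in a small number of countries, span a fairly wide range of subject areas, and vary in their degree of openness.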

15.
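[Purpose/Significance] This paper introduces the Code of Practice for Research Data Usage Metrics (《研究数据使用统计实施规范》), jointly released by Make Data Count and COUNTER, which offers new indicators and a new perspective for data-level metrics. [Method/Process] Through an analysis of the text of the standard, the paper presents the Code's background, goals, scope, related concepts, and core content, and uses case studies to examine how Dash, DataONE, Zenodo, and seven other data repositories apply it. [Result/Conclusion] Usage statistics for research data have distinctive characteristics, and the Code complements data citation and data altmetrics, allowing a more complete description of scholarly research impact. Few data repositories currently follow the Code. Promoting data usage metrics will require cooperation among stakeholders, including standards organizations, researchers, institutional and data repositories, publishers, research institutions and funders, and libraries, across the creation, management, dissemination, and use of data.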

16.
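Drawing on the publication as linked data of the Chinese Science Citation Database and of flux data from the Chinese Ecosystem Research Network, this paper takes the technical framework for publishing linked data as its subject. Working through examples, it proposes a standardized workflow that can serve as a reference for linked data publishing, with a detailed analysis of the key issues involved. The study shows that the workflow can be broken down into six key steps: data modeling, entity naming, converting entities to RDF, linking entities, publishing entities, and open querying. Key issues to consider during publishing include multilingual support, the publication of value vocabularies, and the publication of RDF vocabularies. For publishing data with D2R Server, the paper recommends avoiding blank nodes, designing the relational database well in advance, specifying data types for non-text properties, and splitting or merging entity tables where appropriate.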

17.
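Research on terminology services for controlled vocabularies   Total citations: 2, self-citations: 0, citations by others: 2
Fan Wei. 《图书情报工作》 (Library and Information Service), 2012, 56(14): 34-39, 97
The paper clarifies the role of controlled vocabularies in both vocabulary control and semantic association, explains the meaning and significance of terminology services based on controlled vocabularies, proposes a terminology service life cycle, and abstracts a three-tier architecture for such services. On this basis it designs a Web service and presents a basic set of API services. It closes with a discussion of several important issues in terminology service research and application.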

18.
The aim of this study is to explore the phenomenon of research software citation and, in particular, to draw attention to the increasing importance of this form of citation in scholarly communication. This research sheds light on the current status of formal software citation that is captured by citation databases. Data for the study were gathered from more than 67,000 research software records available in public repositories indexed by Clarivate Analytics’ Data Citation Index (DCI). The metadata characteristics of the indexed records and citation data were then analyzed. Research software was rarely cited in the DCI, suggesting that the documented reuse of research software rarely occurs or is not well documented. Institutional repositories attracted few citations and had a low rate of citation. It proved impossible, however, using the available data to isolate specific identifiers that can promote formal software citation. The findings presented here offer insights into research software citation that will be of interest to funding agencies, publishers, researchers, and research organizations.

19.
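Based on data on paper counts and citation counts for 17 countries (regions) with high paper output, together with the numbers of repositories registered by users on three repository websites (ROAR, Open DOAR, and RWR), this paper finds a significant correlation between paper count, citation count, and number of repositories. On that basis, it tabulates how the repositories of the 17 countries (regions) are distributed across ranking bands on the RWR website. The study shows that repository development in China lags considerably in quantity and is also at a disadvantage in quality. It argues that both theoretical research on repositories and their practical construction deserve attention, and that the positive effect of open-access repositories on paper citations should be fully recognized.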

20.
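Research on dynamic automatic clustering of Chinese information based on a controlled term set   Total citations: 1, self-citations: 0, citations by others: 1
Using a specialized dictionary as the word segmentation tool to build a concept-based category structure with the characteristics of subject classification is a suitable approach to dynamic automatic clustering of Chinese information. The paper explores techniques for dynamic automatic clustering of Chinese information based on a controlled term set, including construction of the specialized controlled term set, the automatic clustering procedure and clustering algorithm in a dynamic, bounded environment, and the use of the controlled term set to optimize and control clustering results. It closes with a brief evaluation of the experimental results.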
