Similar literature
1.
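This paper proposes a text semantic processing method based on latent semantic indexing (LSI) and ontology. First, a virtual standard text feature vector is constructed from the ontology. Next, LSI is used to cluster the text collection semantically, with the virtual standard text feature vector as the reference. Finally, guided by the virtual standard text feature vector, knowledge in the ontology base is used to label explicitly the category and semantics of each clustered text set. Experiments show that the method clusters texts effectively at the semantic level, and that the clustering results explicitly indicate the category to which each cluster belongs.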

2.
蔡宇宏 《编辑学报》2002,14(5):352-354
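The influence of Western languages on the indexing language of Chinese academic journal papers arose as Western culture, dominated by English-language culture, came into extensive contact and collision with Chinese language and culture under globalization. It is the inevitable choice of information retrieval language as it follows the course of historical development, meets the demands of digitized, networked, and internationalized dissemination and exchange of document information resources, and fulfills its own function.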

3.
This study is one of the first to compare the journalistic role performances of English- and Spanish-language TV networks during the 2016 U.S. primaries. Previous research finds that the corporate structure of Spanish-language media in the United States increasingly resembles that of its English-language counterparts and that Latino journalists share the norm of objectivity. Meanwhile, research suggests that individuals of different ethnicities turn to different communication channels and that this divergence can be explained by the degree of alignment in linguistic and cultural orientation. In this study, we therefore assess how linguistic differences between TV networks affected journalistic culture during the 2016 presidential primaries. As a crucial component of journalistic culture, we focus on journalistic role performance and find important distinctions: the greater coverage of presidential candidates as sources on English-language networks has significant consequences for the roles journalists perform. Results suggest that the Spanish-language networks performed significantly more civic journalism roles than their English-language counterparts, which performed interventionist and service roles. These differences are discussed alongside the networks' different audience orientations, which reflect deep racial and ethnic divides.

4.
The paper analyzes best-seller lists in seven major European book markets between April 2008 and March 2009. The authors introduce the concept of an impact factor for best-selling authors that shows how influential an author is in a given market and across the analyzed markets overall. They find that a new generation of European best-selling authors has appeared in the major book markets of Europe, such that those not writing in English have almost twice the impact of the English-language writers. Furthermore, only veteran English or American best-selling authors tend to be published by big media conglomerates; the majority of the European best-selling authors were published by a surprising mix of big and small, independent and international publishing houses. It is striking that English, as the most popular second language in the world, did not play a stronger role as an intermediary language in the transmission of books from one European culture to another, as European publishers in major markets still employ editors who read a variety of languages and thus act as intermediaries in how books travel from one culture to another.

5.
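This paper discusses indexing standards for document databases and indexing models for different types of documents. It also explores an indexing model that combines natural language with controlled language, so as to facilitate automatic conversion to controlled language.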

6.
Language distribution in scientific communication reflects the influence of different languages on science from a global perspective. This study, based on more than 450 thousand scientific tweets about all publications indexed by Scopus in June 2015, reveals the language distribution in informal scientific communication. This result is then compared with the language distribution in formal scientific communication, as reflected in scientific publications. Results show: (1) The language of scientific tweets is concentrated in English (91%), Japanese (2.4%) and Spanish (1.7%), while the language of scientific publications is concentrated in English (90.6%), Chinese (5%) and German (1.1%). (2) Both scientific tweets and scientific publications show disciplinary differences in language distribution, reflecting the different amounts of attention that authors writing in different languages pay to certain disciplines. (3) In every investigated country except Saudi Arabia, regardless of whether the native language is English, English scientific tweets hold the dominant position. For the vast majority of these countries, scientific tweets in the native language rank only second. (4) Overall, 26% of tweeters use more than one language to tweet scientific products, while 49% of scientific tweeters tweet in English only. The results indicate that English has undoubtedly become the lingua franca of informal scientific communication.

7.
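This paper presents a knowledge-based document retrieval method. Starting from the type hierarchy of documents and their file organization, the method builds a dual document model and uses an intelligent, user-oriented navigational search tool to help users formulate well-formed queries. The paper describes the knowledge-based document retrieval architecture and introduces a predicate-based query language and the working principle of the knowledge-based search engine.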

8.
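English textbooks and textbooks for teaching Chinese as a foreign language are currently the two largest, most numerous, and most systematically developed segments of second-language textbook publishing in China, and they occupy a very important place in textbook publishing as a whole. In addition to the general qualities of an editor, editors of second-language textbooks need certain special competencies, such as Chinese-English bilingual ability, a high degree of political sensitivity, familiarity with second-language teaching methods, and intercultural competence.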

9.
To cope with the fact that, in the ad hoc retrieval setting, documents relevant to a query could contain very few (short) parts (passages) with query-related information, researchers proposed passage-based document ranking approaches. We show that several of these retrieval methods can be understood, and new ones can be derived, using the same probabilistic model. We use language-model estimates to instantiate specific retrieval algorithms, and in doing so present a novel passage language model that integrates information from the containing document to an extent controlled by the estimated document homogeneity. Several document-homogeneity measures that we present yield passage language models that are more effective than the standard passage model for basic document retrieval and for constructing and utilizing passage-based relevance models; these relevance models also outperform a document-based relevance model. Finally, we demonstrate the merits in using the document-homogeneity measures for integrating document-query and passage-query similarity information for document retrieval.
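The idea of a passage language model that draws on its containing document "to an extent controlled by the estimated document homogeneity" can be illustrated with a simple linear interpolation. This is only a sketch under stated assumptions: the abstract does not give the paper's estimator, so the mixing formula, the function names, and the `homogeneity` weight in [0, 1] are all hypothetical, and corpus-level smoothing is left out.

```python
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram model: relative term frequencies."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def passage_term_prob(term, passage, document, homogeneity):
    """Assumed interpolation (not the paper's exact estimator):
    p(term) = (1 - h) * p_ML(term | passage) + h * p_ML(term | document).
    A more homogeneous document (h near 1) contributes more to the passage model.
    """
    if not 0.0 <= homogeneity <= 1.0:
        raise ValueError("homogeneity must lie in [0, 1]")
    p_passage = mle(passage).get(term, 0.0)
    p_document = mle(document).get(term, 0.0)
    return (1.0 - homogeneity) * p_passage + homogeneity * p_document


# Toy data: the passage is the first half of the document.
document = ["xml", "retrieval", "passage", "model",
            "passage", "ranking", "language", "model"]
passage = document[:4]

# "ranking" is absent from the passage, so only the document can support it.
for h in (0.0, 0.5, 1.0):
    print(h, passage_term_prob("ranking", passage, document, h))
```

With `homogeneity` set to 0 the passage stands alone, and with 1 it is scored entirely by its document; how the weight is estimated from the document is exactly what the homogeneity measures in the abstract address.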

10.
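Detecting the topical novelty of documents with a natural-language word-pair method   Cited by: 1 (self-citations: 0, citations by others: 1)
[Purpose/Significance] This study proposes a new quantitative indicator, document topic novelty, and uses a natural-language word-pair method to detect the novelty of documents' topical content. It examines the feasibility, strengths, and weaknesses of the method, as well as the relationship between novelty and both F1000 recommendations and citation indicators. [Method/Process] Based on F1000, documents on the topic of hematology recommended within the most recent month were selected, and closely related documents published in the six months before each recommended document were retrieved from PubMed to form the full document set. The concept of natural-language novelty and its formula were defined and implemented in Oracle PL/SQL, and MetaMap was used to extract natural-language terms for computing document topic novelty. [Results/Conclusions] The natural-language method is feasible to a certain extent for computing document topic novelty. Document topic novelty is not equivalent to F1000 recommendation or citation status; these belong to different dimensions and categories of scientific paper evaluation and should not be lumped together. The new indicator should be combined with peer review and other paper evaluation indicators, such as bibliometric ones, to assess documents comprehensively and select high-quality documents for recommendation.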

11.
12.
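Word context vectors are used to express the contextual relationship between a word and the other words in a text collection. On the basis of these word context vectors, category feature vectors are generated for all categories in the classifier, along with the feature vector of the text to be classified; the classifier then assigns the text to a category. Experiments show that incorporating word context relationships into the category feature vectors and text vectors helps improve text classification performance.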
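The pipeline described above (word context vectors, then category and text feature vectors, then a classifier) can be sketched roughly as follows. The abstract does not specify how the vectors are built or compared, so everything here is an assumption for illustration: context vectors are windowed co-occurrence counts, category and text vectors are sums of context vectors, and the classifier picks the category with the highest cosine similarity. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(texts, window=2):
    """Assumed definition: for each word, count the words within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in texts:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def text_vector(tokens, vectors):
    """Feature vector of a text: sum of the context vectors of its words."""
    total = Counter()
    for word in tokens:
        total.update(vectors.get(word, {}))
    return total


def build_category_vectors(training, vectors):
    """Category feature vector: sum of the text vectors of its training texts."""
    result = {}
    for category, docs in training.items():
        total = Counter()
        for tokens in docs:
            total.update(text_vector(tokens, vectors))
        result[category] = total
    return result


def cosine(a, b):
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens, category_vectors, vectors):
    """Assign the category whose feature vector is most similar to the text's."""
    vec = text_vector(tokens, vectors)
    return max(category_vectors, key=lambda c: cosine(vec, category_vectors[c]))


# Toy training data (hypothetical).
training = {
    "sports": [["team", "wins", "match"], ["player", "scores", "goal"]],
    "finance": [["bank", "raises", "rates"], ["market", "rates", "fall"]],
}
all_texts = [tokens for docs in training.values() for tokens in docs]
vectors = context_vectors(all_texts)
categories = build_category_vectors(training, vectors)

print(classify(["team", "scores"], categories, vectors))
print(classify(["bank", "rates"], categories, vectors))
```

Because a text is represented by the contexts of its words and not by the words themselves, two texts can match even when they share no terms directly, which is one plausible reading of why context information helps classification.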

13.
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual similarity based document retrieval as well as fast algorithms for initial document type classification without OCR. A novel feature set called interval encoding is introduced to capture elements of spatial layout. This feature set encodes region layout information in fixed-length vectors by capturing structural characteristics of the image. These fixed-length vectors are then compared to each other through a Manhattan distance computation for fast page layout comparison. The paper describes experiments and results to rank-order a set of document pages in terms of their layout similarity to a test document. We also demonstrate the usefulness of the features derived from interval coding in a hidden Markov model based page layout classification system that is trainable and extendible. The methods described in the paper can be used in various document retrieval tasks including visual similarity based retrieval, categorization and information extraction.
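The comparison step, ranking pages by the Manhattan distance between fixed-length layout vectors, can be sketched as below. The interval-encoding feature extraction itself is not described in the abstract, so the vectors here are made-up placeholders and the function names are hypothetical; only the distance and the rank-ordering are illustrated.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two fixed-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_layout(query, pages):
    """Return (page name, distance) pairs, most similar layout first."""
    scored = ((name, manhattan(query, vec)) for name, vec in pages.items())
    return sorted(scored, key=lambda pair: pair[1])


# Placeholder layout vectors; real ones would come from interval encoding.
pages = {
    "invoice": [4, 0, 2, 7],
    "letter": [1, 1, 0, 3],
    "form": [4, 1, 2, 5],
}
query = [4, 0, 2, 6]

ranking = rank_by_layout(query, pages)
print(ranking)  # [('invoice', 1), ('form', 2), ('letter', 9)]
```

The L1 distance needs only subtraction and addition per component, which fits the abstract's emphasis on fast page layout comparison.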

14.
XML retrieval is a departure from standard document retrieval in which each individual XML element, ranging from italicized words or phrases to full blown articles, is a retrievable unit. The distribution of XML element lengths is unlike what we usually observe in standard document collections, prompting us to revisit the issue of document length normalization. We perform a comparative analysis of arbitrary elements versus relevant elements, and show the importance of element length as a parameter for XML retrieval. Within the language modeling framework, we investigate a range of techniques that deal with length either directly or indirectly. We observe a length-bias introduced by the amount of smoothing, and show the importance of extreme length bias for XML retrieval. We also show that simply removing shorter elements from the index (by introducing a cut-off value) does not create an appropriate element length normalization. Even after restricting the minimal size of XML elements occurring in the index, the importance of an extreme explicit length bias remains.

15.
To our knowledge, no data exist on attitudes toward speakers with Japanese-accented varieties of American English, an area of profound significance given increasing American-Japanese contacts across a wide range of applied contexts. This "matched-guise" study provides such data by eliciting Americans' reactions to a Japanese male talking on two different topics (aggressive versus neutral) using four language varieties (viz., standard, moderate-accented, heavy-accented, and disfluent). Speaker evaluations on status, attractiveness, and dynamism traits confirmed certain predictions based on the literature, but some surprising, yet interpretable, patterns emerged in this new domain of American-Japanese inquiry. Specifically, it was found that Japanese-accented speakers were evaluated in a manner unlike all other non-standard accented speakers of American English, except those of British and Malaysian background. It is suggested that perceptions of social group competitiveness may be responsible for this pattern of results which, in turn, is discussed in terms of its applied ramifications.

16.
Subtitles and captions have been used to aid second language learning. This study focuses on the effects of subtitles and captions on English Language Learners' ability to learn information literacy skills and apply those skills using an interactive tutorial. Three groups of Turkish university students majoring in English Language and Literature completed a tutorial on ACRL's Framework scholarly conversations. One group completed the tutorial with an English soundtrack and no titling; the second group completed the tutorial with an English soundtrack and English captions; and the third group completed the tutorial with an English soundtrack and Turkish subtitles. Using Morae software, the students were recorded and evaluated for time-on-task and correct completion of the interactive practice elements. The group that viewed the tutorial with an English soundtrack and Turkish subtitles completed tasks statistically significantly faster than the other groups and with statistically significantly more success.

17.
胡静 《大观周刊》2012,(18):184-186
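In learning English as a second language, listening and speaking are common weak points for Chinese students. Speaking in particular, constrained by the language environment, is the weakest link and a bottleneck that is hard to break through. At third-tier undergraduate colleges, where students generally have weak foundations and poor self-discipline, raising students' speaking proficiency is even harder. Taking the oral English teaching reform at Haojing College of Shaanxi University of Science and Technology as an example, the article describes in detail the college's exploration of oral English teaching. By analyzing successes and failures at different stages, surveying students who took part in large-scale intensive oral training, and comparing the results, it draws conclusions and offers suggestions for building oral English courses and for teaching methods.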

18.
金悦  赵彦昌 《档案学研究》2022,36(6):136-143
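The diplomatic note was the most commonly used form of document in modern foreign relations. The consular report archives of the modern-era U.S. Consulate in Mukden (Fengtian), held by the U.S. National Archives, contain a large number of notes exchanged between U.S. consuls and the authorities of Northeast China. These notes are rich in content, exist in both Chinese and English, and are completely preserved; as diplomatic documents they carry rich and varied historical meaning and therefore have very high archival value. This paper combs through the consular note archives in detail, classifies them by direction of dispatch, language, subject matter, and parties involved, and then studies them in depth in terms of document form and historical meaning. The research will help archival science and history better understand the evolution of the format and content of modern diplomatic notes and the deeper historical context they embody. Notes exchanged between China and the United States gradually evolved from traditional, elaborate formats toward an efficient and concise style, and they also serve as evidence of the mutual blending and influence of Chinese and American diplomatic documents and diplomatic ideas.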

19.
Consistent with predictions of an economic model of international trade in media products, we show that in countries that have relatively high consumer spending on movies (notably the United States), domestically produced movies account for relatively large shares of theater box office receipts. We also find that American-produced movies account for relatively small market shares of the box office in high movie-spending foreign countries. In addition, English language fluency, or a dummy variable for non-U.S. countries whose native language is English, generally has an insignificant or marginally significant effect on these results.

20.
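Research and implementation of a Chinese-English bilingual tag set   Cited by: 1 (self-citations: 0, citations by others: 1)
A tag set is the foundation of knowledge representation in any natural language processing research. Drawing on practical experience in developing Chinese-English bidirectional machine translation and in processing bilingual corpora, this paper argues for the necessity of a standard Chinese-English bilingual tag set, discusses several key issues encountered in designing it, and presents a fairly complete solution. Practice shows that the solution is open and compatible, and that it is applicable to both Chinese-English bidirectional machine translation systems and Chinese-English bilingual corpus research.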
