Similar documents
Found 20 similar documents.
1.
Blog feed search aims to identify a blog feed of recurring interest to users on a given topic. A blog feed, the retrieval unit for blog feed search, comprises blog posts on diverse topics. This topical diversity often degrades the performance of blog feed search. To alleviate the problem, this paper proposes several approaches based on passage retrieval, which is widely regarded as effective for handling topical diversity at the document level in ad-hoc retrieval. We define the global and local evidence for blog feed search, which correspond to the document-level and passage-level evidence for passage retrieval, respectively, and investigate their influence on blog feed search in terms of both initial retrieval and pseudo-relevance feedback. For initial retrieval, we propose a retrieval framework that integrates global evidence with local evidence. For pseudo-relevance feedback, we gather feedback information from the local evidence of the top K ranked blog feeds to capture diverse and accurate information related to a given topic. Experimental results show that our approaches using local evidence consistently and significantly outperform traditional ones.

2.
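As important agents of science communication, traditional mainstream media play a major role in reporting science events and popularizing scientific knowledge. To better understand the characteristics and effects of online science communication by traditional mainstream media on social media, this paper examines the textual features of mainstream media's science-related posts and their influence on communication effects. First, we collected all of the more than 110,000 posts published on Weibo throughout 2021 by nine official mainstream media outlets and, using keywords related to science communication, filtered out more than 6,000 science-related posts. Topic modeling of the text dataset with LDA yielded 29 first-level topics and 7 second-level topics, giving the overall topic distribution of mainstream media science communication. The meanings covered by the specific topics show that mainstream media both report scientific discoveries and technological innovation promptly and continuously, and produce and distribute knowledge-popularization content on livelihood, health, and other matters closely related to the public. Second, we conducted a content analysis based on manual coding of a sampled dataset to obtain the sentiment stance and cited source of each post in the sample. Finally, we studied the relationships between three textual features (topic, sentiment stance, and cited source) and three indicators of communication effect (reposts, likes, and comments). The results show that topic and sentiment stance significantly affect all three communication indicators, whereas cited source has no significant effect. Posts that popularize livelihood-related scientific knowledge and take a positive sentiment stance perform significantly better than other posts. The public shows different levels of enthusiasm for sharing science posts with different textual features; content that is closely tied to everyday-life knowledge and more likely to evoke emotional resonance achieves better communication effects.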

3.
User generated content forms an important domain for mining knowledge. In this paper, we address the task of blog feed search: to find blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to mention the topic in passing. The large number of blogs makes the blogosphere a challenging domain, both in terms of effectiveness and of storage and retrieval efficiency. We examine the effectiveness of an approach to blog feed search that is based on individual posts as indexing units (instead of full blogs). Working in the setting of a probabilistic language modeling approach to information retrieval, we model the blog feed search task by aggregating over a blogger’s posts to collect evidence of relevance to the topic and persistence of interest in the topic. This approach achieves state-of-the-art performance in terms of effectiveness. We then introduce a two-stage model where a pre-selection of candidate blogs is followed by a ranking step. The model integrates aggressive pruning techniques as well as very lean representations of the contents of blog posts, resulting in substantial gains in efficiency while maintaining effectiveness at a very competitive level.

4.
In the new media environment, hard news stories are no longer found solely in the “A” section of the paper or on the front page of a news Web site. They are now distributed widely, appearing in contexts as disparate as a partisan blog or your own e-mail inbox, forwarded by a friend. In this study, we investigate how the credibility of a news story is affected by the context in which it appears. Results of an experiment show a news story embedded in an uncivil partisan blog post appears more credible in contrast. Specifically, a blogger's incivility highlights the relative credibility of the newspaper article. We also find that incivility and partisan disagreement in an adjacent blog post produce stronger correlations between ratings of news and blog credibility. These findings suggest that news story credibility is affected by context and that these context effects can have surprising benefits for news organizations. Findings are consistent with predictions of social judgment theory.

5.
Communication Monographs, 2012, 79(4): 511-534
The study reported here explored the social dimension of health-related blogs by examining blogging as a means to marshal social support and, as a result, achieve some of the health benefits associated with supportive communication. A total of 121 individuals who author a blog dedicated to their experience living with a specific health condition completed the study questionnaire. The number of blog posts made by respondents and proportion of posts with reader comments were positively associated with perceived social support from blog readers. The relationship between blog reader support and two outcomes related to well-being depended upon the support available in bloggers' strong-tie relationships with family and friends. Consistent with the social compensation (i.e., “poor get richer”) perspective, blog reader support was negatively associated with loneliness and positively associated with personal growth when support in strong-tie relationships was relatively lacking.

6.
Based on the hostile media effect (HME), this 2 (partisan opinion) × 2 (news source) × 2 (content valence) factorial experiment investigated how partisans (N = 132), in terms of perceived bias and credibility, assess same-sex marriage coverage by either an online mainstream news source or a citizen blog. Partisans who disagreed with the content's valence evaluated both mainstream online news and the blog posting as more biased and less credible than did partisans who agreed with the content's valence. The perceived reach of blog postings appears to generate a relative HME similar to that triggered by mainstream news. In particular, this study suggests that user-generated content, specifically blog postings, might generate a stronger relative HME than that observed with mainstream news.

7.
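This paper proposes that relevant free information websites be analyzed for credibility, so that only information from websites with relatively high credibility can serve as comparison literature for a project. An expert panel is established, and general website evaluation indicators are combined with the requirements of novelty search; credibility evaluation indicators are then determined using a 9-point scoring method and analyzed in terms of accuracy, verifiability, relevance, and authority. The AHP method is used to set the weights of the indicators at each level, websites with high credibility are added to the set of reference websites for novelty searches in the given industry, and a case study illustrates the feasibility of the method.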
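The AHP weighting step named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the comparison matrix or the exact weighting procedure, so everything here is an assumption: the pairwise judgments are hypothetical, and the row geometric mean method is used as one common way to derive AHP weights (the principal eigenvector is another). Only the four criterion names come from the abstract.

```python
import math


def ahp_weights(pairwise):
    """Derive criterion weights from an AHP pairwise comparison matrix.

    Uses the row geometric mean method (an assumption; the source does not
    say which AHP variant was applied). pairwise[i][j] states how much more
    important criterion i is than criterion j.
    """
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("pairwise comparison matrix must be square")
    # Geometric mean of each row, then normalize so the weights sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Criterion names from the abstract; the judgments below are hypothetical.
criteria = ["accuracy", "verifiability", "relevance", "authority"]
pairwise = [
    [1.0, 2.0, 2.0, 4.0],
    [1 / 2, 1.0, 1.0, 2.0],
    [1 / 2, 1.0, 1.0, 2.0],
    [1 / 4, 1 / 2, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
for name, weight in zip(criteria, weights):
    print(f"{name}: {weight:.3f}")
```

A real application would also check the consistency of the expert judgments before trusting the weights; that step is omitted here.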

8.
The influential Text REtrieval Conference (TREC) has always relied upon specialist assessors, or occasionally participating groups, to create relevance judgements for the tracks that it runs. Recently, however, crowdsourcing has been championed as a cheap, fast and effective alternative to traditional TREC-like assessments. In 2010, TREC tracks experimented with crowdsourcing for the very first time. In this paper, we report our successful experience in creating relevance assessments for the TREC Blog track 2010 top news stories task using crowdsourcing. In particular, we crowdsourced both real-time newsworthiness assessments for news stories and traditional relevance assessments for blog posts. We conclude that crowdsourcing appears to be not only a feasible but also a cheap and fast means of generating relevance assessments. Furthermore, we detail our experiences running the crowdsourced evaluation of the TREC Blog track, discuss the lessons learned, and provide best practices.

9.
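Taking as its object the blog posts in the paper-exchange column of the ScienceNet (科学网) blog channel, this study designed a survey with 5 levels and 44 specific indicators covering originality, language, content of exchange, effect of exchange, and other aspects. The results show that blog posts, as a new information resource, contain not only a large amount of traditional academic material but also much related information, unavailable in traditional academic resources, that helps readers understand and absorb scholarly results. Library and information institutions should pay attention to them.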

10.
The researchers examined student perceptions of campus and community newspaper credibility at the University of Florida using a Web survey (n = 1,906) of those enrolled in a general education class. A moderate correlation (r = .28) existed between college newspaper credibility and community newspaper credibility. Using hierarchical linear regression, the researchers found interest in news content to be a statistically significant predictor of credibility for both local newspapers and college newspapers. In addition, students whose parents encouraged them to read a newspaper found both newspapers more credible than did their peers, and exposure to a newspaper was found to be a strong predictor of credibility for that newspaper. Finally, the results of this case study also suggest that White respondents find local newspapers more credible than do respondents of other races. Implications for researchers and practitioners were discussed.

11.
A metric analysis of blogs on library and information science (LIS) between November 2006 and June 2009 indexed on the Libworm search engine characterizes the community's behavior quantitatively. An analysis of 1108 personal and corporate blogs with a total of 275,103 posts is used to calculate survival rate, production (number of posts published), and visibility via such indicators as links received, Technorati authority, and Google's PageRank. Over the study period, there was a 52% decrease in the number of active blogs. Despite the drop in production over this period, the average number of posts per blog remained constant (14 per month). The most representative blogs in the discipline are identified. The emergence of such platforms as Facebook and Twitter seems to have meant that both personal and corporate blogs have lost some of their prominence.

12.
The Cross-Language Evaluation Forum has encouraged research in text retrieval methods for numerous European languages and has developed durable test suites that allow language-specific techniques to be investigated and compared. The labor associated with crafting a retrieval system that takes advantage of sophisticated linguistic methods is daunting. We examine whether language-neutral methods can achieve accuracy comparable to language-specific methods with less concomitant software complexity. Using the CLEF 2002 test set we demonstrate empirically how overlapping character n-gram tokenization can provide retrieval accuracy that rivals the best current language-specific approaches for European languages. We show that n = 4 is a good choice for those languages, and document the increased storage and time requirements of the technique. We report on the benefits of and challenges posed by n-grams, and explain peculiarities attendant to bilingual retrieval. Our findings demonstrate clearly that accuracy using n-gram indexing rivals or exceeds accuracy using unnormalized words, for both monolingual and bilingual retrieval.
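Overlapping character n-gram tokenization, as described in the abstract above, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the lowercasing, the whitespace normalization, and the use of `_` as a word-boundary marker are choices made here, and only the default of n = 4 comes from the abstract.

```python
def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of a text.

    Assumptions (not from the source): text is lowercased, runs of
    whitespace collapse to a single '_' boundary marker, and the text is
    padded with '_' at both ends so word edges are represented.
    """
    words = text.lower().split()
    if not words:
        return []
    normalized = "_" + "_".join(words) + "_"
    if len(normalized) < n:
        # Too short to window; keep the padded text as a single token.
        return [normalized]
    # Slide a window of width n one character at a time.
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


print(char_ngrams("blog feed"))
```

Because each character starts its own n-gram, a text yields roughly as many index terms as it has characters, which fits the abstract's note on increased storage and time requirements.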

13.
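A brief discussion of the reliability of blogs as a source of competitive intelligence   Total citations: 1 (self-citations: 1; citations by others: 0)
With the development of network technology, blogs have become an important information source. This paper points out that blogs carry risks as a source of competitive intelligence and elaborates on four aspects: the massive volume of blog information resources, the breadth of blog content, the complexity of bloggers' identities, and the purposes for which blogs are published. It then analyzes the reliability of blogs as an information source from three angles: the credibility of bloggers, the credibility of blog content, and the risks involved in processing information drawn from blogs.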

14.
This research analyzed a dataset of academic libraries' posts on Facebook. It applied a text and data analytics approach to a dataset collected from the Facebook posts of academic libraries at the top 100 English-speaking universities, as listed by the 2014 Shanghai World University Rankings. The dataset is from a two-year posting history of 18,333 unique posts, 113,621 likes, and 3401 comments. Less than a quarter of the libraries had more than 2000 post-related likes, and only seven received more than 100 comments on their postings. Content analysis identified the most prevalent single words (unigrams), bigrams (two-word sequences), and trigrams (three-word sequences) in high and low engagement content. Semantic analysis identified the semantic categories for posts with high and low engagement. The findings can assist academic libraries in their social media strategies for engagement, marketing, and visibility.
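The unigram, bigram, and trigram counts mentioned in the abstract above can be sketched as follows. This is a rough illustration, not the study's pipeline: the tokenizer, the sample posts, and the function name `top_ngrams` are all hypothetical, and the study may well have applied stop-word removal or other preprocessing that is not shown here.

```python
import re
from collections import Counter


def top_ngrams(posts, n, k=3):
    """Return the k most frequent word n-grams across a list of posts.

    Assumption: a simple lowercase regex tokenizer; n-grams are counted
    within each post and never span two posts.
    """
    counts = Counter()
    for post in posts:
        tokens = re.findall(r"[a-z0-9']+", post.lower())
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts.most_common(k)


# Hypothetical posts, for illustration only.
posts = [
    "Library hours extended",
    "library hours change today",
]
print(top_ngrams(posts, 2, 1))
```

Running the same function separately on high-engagement and low-engagement posts would give the kind of side-by-side comparison the abstract describes.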

15.
We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from un-regularized scores, consistently and significantly result in improved performance given a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
Fernando Diaz

16.
A huge volume of news stories is reported by various news channels on a daily basis. Subscribing to all the stories and keeping track of the important ones day after day is very time-consuming. This paper proposes several approaches to identify important news stories. To this end, we take advantage of the blogosphere as an information source to evaluate the importance of news stories. Blogs reflect the diverse opinions of bloggers about news stories, and the attention that these stories receive can help estimate the importance of the stories. In this paper, we define the popularity of a news story in the blogosphere as the attention it attracts from users. We measure the popularity of the stories in the blogosphere from two viewpoints: content and timeline. In terms of content, we suggest several approaches to estimate language models for a news story and blog posts, and we evaluate the importance of the story using these language models. Furthermore, we generate a temporal profile of a news story by exploring the timeline of blog posts related to the story, and evaluate its importance based on the temporal profile. We experimentally verify the effectiveness of the proposed approaches for identifying top news stories.

17.
This case study will take readers through the planning and publication process of a collaborative departmental library blog at Syracuse University, which is a large private, non-profit, research-intensive university located in central New York State. It will provide an overview of the history of the project and the mission of the blog. It will describe the technical aspects, developing a publication schedule, and the editorial responsibilities of maintaining the blog. The impact of the blog is documented. The blog has raised awareness of the librarians' expertise and this is explored alongside how posts have contributed to a number of wider conversations in librarianship.

18.
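Exploratory practice and rational reflection on subject blogs at the Tsinghua University Library   Total citations: 6 (self-citations: 1; citations by others: 5)
To meet the needs of developing subject services, the Tsinghua University Library experimented with a subject blog for journalism and communication. The blog is structured around its category scheme and, beyond publishing subject information, organizes subject resources effectively. In actual use, the blog has fairly fully realized the strengths of blogs in timeliness, accumulation, sharing, and ease of operation; however, the subject blog falls somewhat short in interactivity. The author analyzes this shortfall in terms of the communication characteristics of blogs and summarizes the lessons learned in building the subject blog.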

19.
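A comparative study of blog search engines and traditional search engines   Total citations: 8 (self-citations: 0; citations by others: 8)
This paper briefly introduces blogs and well-known blog search engines in China and abroad. Focusing on the differences between blog search engines and traditional search engines, it systematically analyzes and compares the two in terms of working principles, retrieval content, and retrieval methods. It selects four representative topics from different areas and evaluates the retrieval functions and retrieval performance of representative engines of each type. Finally, it points out the respective strengths and weaknesses of the two types in resource value, retrieval methods, and personalized services, with the aim of informing improvements to both.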

20.
In an experimental study conducted in Switzerland the effects of newscasters' gender and age on credibility were analyzed using a 2 × 2 × 2 factorial design. Participants (N = 160) evaluated Swiss, German, and Austrian TV news items in terms of credibility of the newscaster and credibility of the message. News items read by female newscasters were perceived as being more credible. In contrast, male newscasters were considered to be more credible persons. Furthermore, a significant interaction between the newscasters' gender and age was observed: Age had no effect on the credibility of the younger newscasters, whereas older male newscasters were perceived as being the most credible.
