Similar Literature
20 related records found
1.
Analysis of interaction characteristics between social network users based on topic segmentation (Total citations: 1; self-citations: 0; other citations: 1)
[Purpose/Significance] Focusing on a microblog sub-network, this study examines historical interaction records between users from the perspective of topic segmentation, in order to reveal users' topic preferences in their interactions and to understand, at the micro level, the patterns of users' information-dissemination behavior. [Method/Process] A user case analysis first establishes the need to segment user-to-user interactions by topic; the LDA topic model is then applied to the historical interaction records to perform this segmentation, and multi-dimensional vectors are used to represent the interaction strength between users under different topics; finally, statistical analysis and network analysis are used to explore the topical characteristics of user interactions. [Result/Conclusion] The distribution of interaction strength between users under each topic has a long-tail character; the content of user interactions is topically correlated over time; and, based on the multi-dimensional interaction strengths, user interaction sub-networks for specific topics can be extracted. The temporal topic correlation of user interactions and the topic-specific interaction sub-networks can be used to monitor and predict the diffusion of information on specific topics.
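A minimal sketch of the core step, assuming gensim is available and that the interaction records of one user pair are already tokenized (the toy records, variable names, and the 10-topic setting are illustrative, not taken from the paper):

```python
# Sketch: segment the interaction records of one user pair by topic with LDA and
# aggregate them into a per-topic interaction-strength vector (one count per
# interaction, assigned to its dominant topic). Toy data; K=10 is illustrative.
import numpy as np
from gensim import corpora
from gensim.models import LdaModel

# Each element: the tokenized text of one interaction (comment/repost) in the pair's history
interactions = [
    ["movie", "weekend", "cinema", "ticket"],
    ["stock", "market", "drop", "shares"],
    ["movie", "director", "review", "cinema"],
]

dictionary = corpora.Dictionary(interactions)
corpus = [dictionary.doc2bow(tokens) for tokens in interactions]

K = 10  # number of topics (illustrative)
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=K,
               passes=10, random_state=0)

strength = np.zeros(K)  # multi-dimensional interaction-strength vector for this user pair
for bow in corpus:
    topic_dist = lda.get_document_topics(bow, minimum_probability=0.0)
    dominant = max(topic_dist, key=lambda t: t[1])[0]
    strength[dominant] += 1

print(strength)
```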

2.
《Research Policy》2022,51(3):104456
While there are numerous studies of university technology transfer, there have been relatively few studies of technology transfer at federal labs. Moreover, studies of university technology transfer have focused on faculty, not post-doctoral scientists. They have also ignored identity and sensemaking theories in organizational behavior, which are relevant in the context of technology transfer. We fill these gaps by examining differences between university post-doctoral scientists and federal lab post-doctoral scientists, in terms of how they engage in technology transfer. Our qualitative analysis is based on extensive interviews of post-doctoral scientists and their supervisors/principal investigators (PIs) at two major research universities and four large federal labs. We find that federal lab scientists are more influenced by mission-driven research and their sense of public service, as compared to university scientists who are motivated more by curiosity-driven research. These motivational differences may constitute significant barriers to technology transfer in federal labs. As compared to their university counterparts, federal lab scientists appear to experience more cognitive dissonance in pursuing commercialization of their research and have more sophisticated resolution strategies for dealing with such dissonance. We also find that PIs at federal labs are not highly incentivized to engage in technology transfer. We discuss additional research needs, as well as the managerial and training implications of our findings.

3.
While test collections provide the cornerstone for Cranfield-based evaluation of information retrieval (IR) systems, it has become practically infeasible to rely on traditional pooling techniques to construct test collections at the scale of today's massive document collections (e.g., ClueWeb12's 700M+ Webpages). This has motivated a flurry of studies proposing more cost-effective yet reliable IR evaluation methods. In this paper, we propose a new intelligent topic selection method which reduces the number of search topics (and thereby costly human relevance judgments) needed for reliable IR evaluation. To rigorously assess our method, we integrate previously disparate lines of research on intelligent topic selection and deep vs. shallow judging (i.e., whether it is more cost-effective to collect many relevance judgments for a few topics or a few judgments for many topics). While prior work on intelligent topic selection has never been evaluated against shallow judging baselines, prior work on deep vs. shallow judging has largely argued for shallow judging, but assuming random topic selection. We argue that for evaluating any topic selection method, ultimately one must ask whether it is actually useful to select topics, or should one simply perform shallow judging over many topics? In seeking a rigorous answer to this over-arching question, we conduct a comprehensive investigation over a set of relevant factors never previously studied together: 1) method of topic selection; 2) the effect of topic familiarity on human judging speed; and 3) how different topic generation processes (requiring varying human effort) impact (i) budget utilization and (ii) the resultant quality of judgments. Experiments on NIST TREC Robust 2003 and Robust 2004 test collections show that not only can we reliably evaluate IR systems with fewer topics, but also that: 1) when topics are intelligently selected, deep judging is often more cost-effective than shallow judging in evaluation reliability; and 2) topic familiarity and topic generation costs greatly impact the evaluation cost vs. reliability trade-off. Our findings challenge conventional wisdom in showing that deep judging is often preferable to shallow judging when topics are selected intelligently.

4.
There is no doubt that scientific discoveries have always brought changes to society. New technologies help solve social problems such as transportation and education, while research brings benefits such as curing diseases and improving food production. Despite the impacts science and society have on each other, this relationship is rarely studied, and the two are often seen as different universes. Previous literature focuses on a single domain only, for example detecting social demands or research fronts, without ever crossing the results for new insights. In this work, we create a system that is able to assess the relationship between social and scholarly data using the topics discussed in social networks and research topics. We use articles as science sensors and humans as social sensors via social networks. Topic modeling algorithms are used to extract and label social subjects and research themes, and topic correlation metrics are then used to create links between them if they have a significant relationship. The proposed system is based on topic modeling, labeling and correlation from heterogeneous sources, so it can be used in a variety of scenarios. We evaluate the approach using a large-scale Twitter corpus combined with a PubMed article corpus. In both, we work with data on the worldwide Zika epidemic, as this scenario provides topics and discussions in both domains. Our work was capable of discovering links between various topics of different domains, which suggests that some of the relationships can be automatically inferred by the sensors. The results open new opportunities for forecasting social behavior, assessing community interest in a scientific subject, or directing research toward public welfare.
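As a rough illustration of the kind of cross-domain topic linking described above (not the authors' exact correlation metric; the threshold and the random stand-in distributions are assumptions), topics learned separately on the social and scholarly corpora can be linked by comparing their word distributions over a shared vocabulary:

```python
# Sketch: link topics from two domains (e.g., Twitter vs. PubMed) by the cosine
# similarity of their topic-word distributions over a shared vocabulary.
# The 0.3 threshold and the random stand-in distributions are illustrative.
import numpy as np

def topic_links(phi_social, phi_scholar, threshold=0.3):
    """phi_*: (n_topics, n_shared_words) arrays, rows = P(word | topic)."""
    a = phi_social / np.linalg.norm(phi_social, axis=1, keepdims=True)
    b = phi_scholar / np.linalg.norm(phi_scholar, axis=1, keepdims=True)
    sim = a @ b.T                      # pairwise cosine similarities
    links = np.argwhere(sim >= threshold)
    return [(int(i), int(j), float(sim[i, j])) for i, j in links]

# Example with random stand-in distributions over a shared vocabulary of 500 words
rng = np.random.default_rng(0)
phi_social = rng.dirichlet(np.ones(500), size=20)   # 20 social-media topics
phi_scholar = rng.dirichlet(np.ones(500), size=30)  # 30 scholarly topics
print(topic_links(phi_social, phi_scholar)[:5])
```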

5.
Strategic analysis of the priority of research topics in a discipline helps researchers and research-management decision makers quickly grasp the state of research in the field and identify scientific frontiers, thereby supporting and promoting research output. Taking research topics in Library and Information Science (LIS) as an example, this paper combines topic extraction with trend analysis: on the basis of the extracted topics, and along the two dimensions of publication trend and citation trend, it draws a strategic coordinate map of research topics in Chinese LIS comprising four zones ("under-researched", "hot", "cold", and "overheated"). The results show that the proposed trend-based strategic coordinates can effectively display the development stage of different research topics in a discipline and present their development levels comprehensively and in detail.
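A hedged sketch of the quadrant idea, assuming each topic is summarized by a yearly publication-count series and a yearly citation-count series; the slope-based zone mapping below is one plausible reading of the four zones, not the paper's exact procedure:

```python
# Sketch: place topics into four strategic zones from the slopes of their
# publication and citation trends (linear fit over years). Zone mapping is
# an illustrative assumption, not the paper's exact definition.
import numpy as np

def trend_slope(yearly_counts):
    years = np.arange(len(yearly_counts))
    return np.polyfit(years, yearly_counts, 1)[0]

def zone(pub_counts, cit_counts):
    p, c = trend_slope(pub_counts), trend_slope(cit_counts)
    if p >= 0 and c >= 0:
        return "hot"               # rising output and rising citations
    if p >= 0 and c < 0:
        return "overheated"        # output still rising, citation impact falling
    if p < 0 and c >= 0:
        return "under-researched"  # little new output but rising impact
    return "cold"

print(zone([3, 5, 8, 12], [10, 14, 20, 30]))  # -> "hot"
```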

6.
With the emergence and development of deep generative models, such as variational auto-encoders (VAEs), research on topic modeling has successfully extended to a new area: neural topic modeling, which aims to learn disentangled topics to understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, and it brings these inherent defects to a neural topic model (NTM). In this paper, we put forward that the optimization objectives of contrastive learning are consistent with two important goals of well-disentangled topic learning (alignment and uniformity), and also with two key evaluation measures for topic models, topic coherence and topic diversity. We therefore conclude that the alignment and uniformity of disentangled topic learning can be quantified with topic coherence and topic diversity. Accordingly, we are inspired to propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare our proposed CNTM with strong baseline models on widely-used metrics. Our model achieves the best topic coherence scores under the most general evaluation setting (100% of topics selected), with improvements of 25.0%, 10.9%, 24.6%, and 51.3% over the second-best models' scores on the 20 Newsgroups, Web Snippets, Tag My News, and Reuters datasets, respectively. Our method also obtains the second-best topic diversity scores on the 20 Newsgroups and Web Snippets datasets. Our experimental results show that CNTM can effectively leverage the disentanglement ability of contrastive learning to address the inherent defect of neural topic modeling and obtain better topic quality.
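A hedged PyTorch sketch of an InfoNCE-style contrastive objective over document-level topic representations (illustrative of the alignment/uniformity idea, not the CNTM code; the batch size, dimensionality, and temperature are assumptions):

```python
# Sketch: symmetric InfoNCE loss over L2-normalized topic representations of
# two augmented views of the same documents. Aligning positive pairs and pushing
# apart the rest corresponds to the alignment/uniformity goals. Illustrative only.
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.07):
    """z1, z2: (batch, dim) topic representations of two views of the same documents."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))      # positives lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

z1 = torch.randn(16, 50)   # e.g., 50-dimensional topic proportions / embeddings
z2 = torch.randn(16, 50)
print(contrastive_loss(z1, z2).item())
```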

7.
The addresses of national leaders can affect their public support and spur changes in the country's economy. To date, very few studies exist establishing these relationships, and no research has been done on the addresses of Vladimir Putin. In this paper we fill this knowledge gap by analysing Putin's nationwide phone-ins, a special annual format in which he addresses the public, and by using structural topic modelling to study their topics over time. Furthermore, we relate these topics to public approval of the president and the government as well as to Russian macroeconomic indicators such as inflation and budget expenditures. Based on our data containing 1938 responses and almost 250 thousand words, we identify 16 main topics covering areas from healthcare and education through economics to elections and legislation. We find that the topic of foreign affairs has gained the most in popularity over time (from around 4.5% at the beginning to more than 10% starting from 2014). Another topic, consistently gaining weight in the president's statements, is related to solving particular problems of the general public (from 8% to 12.5%) and is significantly correlated with a subsequent decrease in the country's unemployment (Pearson's correlation coefficient -0.502). We also find that when the government's support is decreasing, Putin tends to discuss more socially significant topics (e.g., inflation, healthcare; Pearson's coef. around -0.5), while when support is rising, he speaks more about foreign affairs (Pearson's coef. 0.773). Our study provides the first evidence that Vladimir Putin may adapt the content of his phone-in meetings to gather public support and influence the country's economy.
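A minimal sketch of the kind of correlation reported above, using scipy's pearsonr on made-up yearly series (the numbers are illustrative, not the study's data):

```python
# Sketch: Pearson correlation between a topic's yearly share in the phone-ins
# and an external series (e.g., approval rating). All numbers are made up.
from scipy.stats import pearsonr

topic_share = [0.045, 0.05, 0.06, 0.08, 0.10, 0.11]   # share of a topic over six years
approval    = [0.62, 0.64, 0.66, 0.80, 0.82, 0.81]    # yearly approval rating

r, p_value = pearsonr(topic_share, approval)
print(f"Pearson r = {r:.3f}, p = {p_value:.3f}")
```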

8.
Research trends are key inputs for researchers deciding their research agendas. However, only a few works have tried to quantify how scholars follow research trends. We address this question by proposing a novel measurement for quantifying how a scientific entity (paper or researcher) follows the hot topics in a research field. Based on extended dynamic topic modeling, the degree to which papers and scholars trace hot topics is explored from three perspectives: the mainstream, short-term directions, and long-term directions. By analyzing a large-scale dataset containing more than 270,000 papers and 45,000 authors in Computer Vision (CV), we found that authors orient themselves toward the established mainstream rather than toward incremental directions, and show little preference between long-term and short-term directions. Moreover, by clustering research behavior we identified six groups of researchers in the CV community, who differed significantly in their patterns of orientation, topic selection, and impact. This study provides a new quantitative method for analyzing topic trends and scholars' research interests, capturing the diversity of research behavior patterns behind the canonical and ubiquitous progress of research fields.

9.
This paper proposes a topic analysis method for science and technology news based on the LDA topic model. Using polar scientific expedition news from China, Australia, the United Kingdom, and the United States from 2009 to 2018, topic evolution is analyzed from the perspectives of topic type and topic intensity. In the Chinese news, the popularity of topics such as polar surveying and mapping rose while that of polar glacier research declined; in the English news, the hot topics were polar glacier research and polar ocean research; the remaining topics stayed relatively stable. The results show that the method can effectively identify science and technology news topics, reveal their evolution trends, and improve the degree of automation of science and technology intelligence analysis in the network environment.

10.
Inferring users' interests from their activities on social networks has been an emerging research topic in recent years. Most existing approaches rely heavily on a user's explicit contributions (posts) and overlook users' implicit interests, i.e., potential interests that the user did not explicitly mention but might hold. Given a set of active topics present in a social network in a specified time interval, our goal is to build an interest profile for a user over these topics by considering both the explicit and implicit interests of the user. This matters because the interests of free-riders and cold-start users, who constitute a large majority of social network users, cannot be directly identified from their explicit contributions to the social network. Specifically, to infer users' implicit interests, we propose a graph-based link prediction schema that operates over a representation model consisting of three types of information: user explicit contributions to topics, relationships between users, and the relatedness between topics. Through extensive experiments on different variants of our representation model, considering both homogeneous and heterogeneous link prediction, we investigate how topic relatedness and users' homophily relations impact the quality of inferring users' implicit interests. Comparison with state-of-the-art baselines on a real-world Twitter dataset demonstrates the effectiveness of our model in inferring users' interests in terms of perplexity and in the context of a retweet prediction application. Moreover, we show that the impact of our work is especially meaningful in the case of free-riders and cold-start users.
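An illustrative sketch of graph-based link prediction for implicit user-topic interest using networkx (the toy graph and the Adamic-Adar scorer are assumptions, not the paper's exact schema):

```python
# Sketch: score unobserved user-topic edges with Adamic-Adar over a graph that
# mixes user-user (homophily), user-topic (explicit contributions) and
# topic-topic (relatedness) edges. Toy graph; not the paper's exact schema.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("u1", "u2"), ("u2", "u3")])              # user-user links
G.add_edges_from([("u1", "t_sports"), ("u2", "t_sports"),   # explicit contributions
                  ("u2", "t_politics")])
G.add_edges_from([("t_sports", "t_health")])                 # topic relatedness

users = ["u1", "u2", "u3"]
topics = ["t_sports", "t_politics", "t_health"]
candidates = [(u, t) for u in users for t in topics if not G.has_edge(u, t)]

# Higher score -> stronger predicted implicit interest of the user in the topic
scores = sorted(nx.adamic_adar_index(G, candidates), key=lambda x: -x[2])
for u, t, s in scores[:5]:
    print(u, t, round(s, 3))
```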

11.
Word frequency analysis applied to research trends in China's nanoscience and nanotechnology (Total citations: 41; self-citations: 6; other citations: 41)
With reference to the 79 nanoscience and nanotechnology keywords provided by the National Research Council of Canada (NRC), and based on the China Journal Net (CNKI) bibliographic database and the Chinese patent information database, this paper applies keyword frequency analysis to show the distribution of China's nanotechnology research output over the past eight years, outline the research areas of Chinese nanoscience and nanotechnology, and analyze research hot spots and weak points.
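A minimal sketch of the keyword-frequency counting behind such an analysis (the records and the tiny keyword list standing in for the 79 NRC terms are toy examples):

```python
# Sketch: count occurrences of a fixed keyword list (standing in for the 79 NRC
# nano keywords) in bibliographic titles, per year. Data are toy examples.
from collections import Counter, defaultdict

keywords = ["carbon nanotube", "quantum dot", "nanowire"]   # illustrative subset
records = [
    (2003, "synthesis of carbon nanotube arrays"),
    (2003, "quantum dot photoluminescence study"),
    (2004, "carbon nanotube field emission and nanowire growth"),
]

freq_by_year = defaultdict(Counter)
for year, title in records:
    title = title.lower()
    for kw in keywords:
        if kw in title:
            freq_by_year[year][kw] += 1

for year in sorted(freq_by_year):
    print(year, dict(freq_by_year[year]))
```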

12.
E-petitions have become a popular vehicle for political activism, but studying them has been difficult because efficient methods for analyzing their content are currently lacking. Researchers have used topic modeling for content analysis, but current practices carry some serious limitations. While modeling may be more efficient than manually reading each petition, it generally relies on unsupervised machine learning and so requires a dependable training and validation process. This paper therefore describes a framework for training and validating Latent Dirichlet Allocation (LDA), the simplest and most popular topic modeling algorithm, using e-petition data. With rigorous training and evaluation, 87% of LDA-generated topics made sense to human judges. Topics also aligned well with results from an independent content analysis by the Pew Research Center, and were strongly associated with corresponding social events. Computer-assisted content analysts can benefit from our guidelines for supervising every step of LDA training and evaluation. Software developers can benefit from learning the demands of social scientists when using LDA for content analysis. These findings have significant implications for developing LDA tools and for assuring the validity and interpretability of LDA-based content analysis. In addition, LDA topics can have some advantages over subjects extracted by manual content analysis: they reflect multiple themes expressed in texts, can surface new themes not highlighted by human coders, and are less prone to human bias.
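A hedged gensim sketch of one automated validation step such a framework might include: training LDA for several topic counts and comparing coherence before topics go to human judges (toy texts; u_mass coherence is used here for robustness on small corpora, with c_v as a common alternative):

```python
# Sketch: train LDA for several topic counts and compare coherence, as one
# automated step before human validation of topics. Toy data; u_mass coherence
# is used for robustness on tiny corpora (c_v is a common alternative).
from gensim import corpora
from gensim.models import CoherenceModel, LdaModel

texts = [["petition", "school", "funding", "teacher"],
         ["visa", "immigration", "policy", "border"],
         ["school", "teacher", "funding", "budget"]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

for k in (2, 3):
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   passes=20, random_state=0)
    cm = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary,
                        coherence="u_mass")
    print(k, "topics -> u_mass coherence:", round(cm.get_coherence(), 3))
```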

13.
Information filtering has long been a major task in the field of information retrieval (IR), traditionally focusing on well-formed documents such as news articles. Recently, more interest has been directed toward applying filtering to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering in microblogs is essential for many applications such as social studies on large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of a broad and dynamic nature. We propose an entirely unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved an 84% increase in recall relative to the baseline approach, while maintaining acceptable precision with a drop of only about 8%. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the tweets relevant to different regions in the Middle East, presenting them as comprehensive reports that include top stories and news in each region.
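A minimal sketch of an adaptive, unsupervised filter of this kind, where the topic profile is a term-weight vector that absorbs accepted tweets so the filter can follow new aspects of the topic (the toy stream, threshold and decay values are assumptions, not the paper's method):

```python
# Sketch: adaptive unsupervised filtering of a broad topic. Tweets scoring above
# a cosine threshold against the topic profile are accepted and folded back into
# the profile (with decay), so the profile adapts to new aspects of the topic.
from collections import Counter
import math

def cosine(p, q):
    dot = sum(p[t] * q.get(t, 0.0) for t in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

profile = Counter({"election": 1.0, "vote": 1.0, "parliament": 1.0})
threshold, decay = 0.2, 0.9   # illustrative values

stream = ["vote counting starts in parliament",
          "new coalition talks after the vote",
          "great recipe for pancakes"]

for tweet in stream:
    terms = Counter(tweet.lower().split())
    if cosine(terms, profile) >= threshold:
        # accept and adapt: decay old weights, absorb the tweet's terms
        profile = Counter({t: w * decay for t, w in profile.items()})
        profile.update(terms)
        print("RELEVANT:", tweet)
    else:
        print("filtered:", tweet)
```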

14.
Social commerce sites (SCSs), a new model of social media, provide fertile ground for customers to communicate their opinions and exchange product- or service-related information. Given the significant opportunities related to the use of social media data for customer insight, we explore the factors driving information sharing behavior on SCSs. In this paper, we propose and empirically test a comprehensive theoretical model of customer information sharing behavior through analysis of online survey data as well as network and behavioral usage data collected over four months from 1,177 customers of an SCS. The research model was empirically validated with the use of both subjective and objective data in a longitudinal setting. Our results show that customer information sharing is influenced by both individual factors (i.e., reputation and the enjoyment of helping others) and social capital factors (i.e., out-degrees' posts, in-degrees' feedback, customer expertise and reciprocity). This study contributes to the existing literature by highlighting the role of directed social networks in customer information sharing behavior on SCSs. We believe that the results of our study offer important insights for IS research and practice.

15.
We propose a topic-dependent attention model for sentiment classification and topic extraction. Our model assumes that a global topic embedding is shared across documents and employs an attention mechanism to derive local topic embeddings for words and sentences. These are subsequently incorporated in a modified Gated Recurrent Unit (GRU) for sentiment classification and for the extraction of topics bearing different sentiment polarities. Those topics emerge from the words' local topic embeddings learned by the internal attention of the GRU cells in the context of a multi-task learning framework. In this paper, we present the hierarchical architecture, the new GRU unit, and experiments conducted on users' reviews, which demonstrate classification performance on a par with state-of-the-art sentiment classification methodologies and topic coherence that outperforms current approaches for supervised topic extraction. In addition, our model is able to extract coherent aspect-sentiment clusters despite using no aspect-level annotations for training.
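A rough PyTorch sketch of the core attention step: deriving a local topic embedding for each word by attending over a global topic-embedding matrix shared across documents (dimensions are illustrative; this is not the authors' architecture):

```python
# Sketch: each word attends over a shared global topic-embedding matrix to
# obtain its local topic embedding; pooling gives a sentence-level topic
# embedding that could then feed a modified GRU / classifier. Illustrative only.
import torch
import torch.nn.functional as F

n_topics, dim, seq_len = 10, 64, 12
global_topics = torch.randn(n_topics, dim)   # global topic embeddings, shared across documents
word_states = torch.randn(seq_len, dim)      # word representations (e.g., GRU hidden states)

attn = F.softmax(word_states @ global_topics.t() / dim ** 0.5, dim=-1)  # (seq_len, n_topics)
local_topic_emb = attn @ global_topics       # (seq_len, dim) word-level local topic embeddings

sentence_topic_emb = local_topic_emb.mean(dim=0)   # simple pooling to sentence level
print(local_topic_emb.shape, sentence_topic_emb.shape)
```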

16.
When researchers disclose their original data, they can enhance the visibility of their research and gain more academic credit (the credit effect). By contrast, doing so may accelerate the knowledge replacement process, which dissipates the academic credit that their research may have received (the competition effect). In this study, we examine whether, and to what extent, scientists gain academic credit for their research by publicly disclosing their data. Drawing on several strands of literature, we hypothesize that data-disclosing research gains more academic credit than non-data-disclosing research in the short term, but that this difference gradually disappears and then reverses as the competition effect emerges. This pattern is expected to differ systematically depending on the academic reputation of the journals in which the data-disclosing research is published. We empirically test the derived hypotheses by analyzing the metadata of over 310,000 Web of Science Core Collection (WoS CC)-indexed journal articles published in 2010. Our analysis supports both hypotheses. The present study contributes to the ongoing policy discussion about the need for institutional measures to promote the disclosure of research data by scientists.

17.
“We the Media” networks are real-time and open, and they lack a gatekeeper system. As netizens’ comments on emergency events are disseminated, negative public opinion topics and confrontations concerning those events also spread widely on “We the Media” networks. This phenomenon has gradually attracted attention from scholars and from all social circles. In existing topic detection studies, a topic is mainly defined as an "event" from the perspective of news-media information flow, but in the “We the Media” era there are often many different views or topics surrounding a specific public opinion event. In this paper, a study on the detection of public opinion topics in “We the Media” networks is presented, starting from the characteristics of the elements found in public opinion on “We the Media” networks; such public opinion is multidimensional, multilayered and possesses multiple attributes. By categorizing the elements’ attributes with reference to social psychology and systems science, we build a multidimensional network model oriented toward the topology of public opinion on “We the Media” networks. Based on the real process by which multiple topics concerning the same event are generated and disseminated, we designed a topic detection algorithm that works on these multidimensional public opinion networks. As a case study, the "Explosion in Tianjin Port on August 12, 2015" accident was selected to empirically analyze the algorithm's effectiveness. The theoretical and empirical findings of this paper can be summarized in three points. 1. The multidimensional network model can effectively characterize the communication characteristics of multiple topics on “We the Media” networks, and it provides the modeling ideas for this paper and for other related studies of “We the Media” public opinion networks. 2. Using the multidimensional topic detection algorithm, 70% of the public opinion topics concerning the case-study event were effectively detected, which shows that the algorithm is effective at detecting topics in the information flow on “We the Media” networks. 3. By defining psychological scores for single and paired Chinese keywords in public opinion information, the topic detection algorithm can also be used to judge the sentiment tendency of each topic, which facilitates a timely understanding of public opinion and reveals negative topics under discussion on “We the Media” networks.

18.
Social media data have recently attracted considerable attention as an emerging voice of the customer, as social media has rapidly become a channel for exchanging and storing customer-generated, large-scale, and unregulated opinions about products. Although product planning studies have applied systematic methods to social media data, these methods have limitations, such as the difficulty of identifying latent product features when only term-level analysis is used, and insufficient analysis of the opportunity potential of the identified features. Therefore, an opportunity mining approach is proposed in this study to identify product opportunities based on topic modeling and sentiment analysis of social media data. For a multifunctional product, this approach can identify latent product topics discussed by customers in social media using topic modeling, thereby quantifying the importance of each product topic. Next, the satisfaction level of each product topic is evaluated using sentiment analysis. Finally, the opportunity value and improvement direction of each product topic, from a customer-centered view, are identified by an opportunity algorithm based on the topics' importance and satisfaction. We expect that our approach will contribute to the systematic identification of product opportunities from large-scale customer-generated social media data and can serve as a real-time monitoring tool for analyzing changing customer needs in rapidly evolving product environments.
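As an illustration of the final step, the sketch below ranks topics with the common "opportunity landscape" heuristic, opportunity = importance + max(importance - satisfaction, 0), which may differ from the paper's exact algorithm (values are made up):

```python
# Sketch: rank product topics by an opportunity score computed from topic
# importance (e.g., share of mentions) and satisfaction (e.g., mean sentiment).
# The formula is the common "opportunity landscape" heuristic, not necessarily
# the paper's exact algorithm. Values are made up.
topics = {
    # topic: (importance 0-10, satisfaction 0-10)
    "battery life":  (8.5, 3.0),
    "camera":        (7.0, 8.0),
    "sound quality": (5.0, 4.5),
}

def opportunity(importance, satisfaction):
    return importance + max(importance - satisfaction, 0.0)

ranked = sorted(topics.items(), key=lambda kv: opportunity(*kv[1]), reverse=True)
for name, (imp, sat) in ranked:
    print(f"{name:14s} opportunity = {opportunity(imp, sat):.1f}")
```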

19.
朱光  潘高枝  李凤景 《情报科学》2022,40(4):127-137
[Purpose/Significance] To identify hot topics in information privacy research and trace their evolution paths. [Method/Process] To address problems such as semantic clutter in topic identification, a topic evolution analysis method is proposed from the perspectives of temporal association and structural representation. First, the LDA (Latent Dirichlet Allocation) model is used to identify document topics within multiple time windows, and co-word analysis is then applied to draw semantically more independent topic cohesive subgroups. On this basis, along the temporal-association dimension, the similarity between topics in adjacent windows is computed to trace evolution paths; along the structural-representation dimension, metrics such as topic novelty, centrality, and influence are designed to explore the evolution of frontier and hot topics in information privacy. [Result/Conclusion] The empirical results show that the proposed method can mine research topics in the information privacy field in depth and trace their evolution paths comprehensively at both macro and micro levels, which helps to detect the research frontier of information privacy. [Innovation/Limitation] The method combines the LDA topic model with co-word analysis to draw topic cohesive subgroups and explores topic evolution paths from the two dimensions of temporal evolution and structural representation. Future research could introduce multiple data sources to compare topic differences and multi-word terms to improve topic identification.
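A minimal sketch of one structural-representation metric in this spirit: scoring the novelty of each topic in the current time window by its best cosine match to the previous window's topics (an illustrative definition with random stand-in distributions, not the paper's exact formula):

```python
# Sketch: topic novelty across adjacent time windows, defined here as 1 minus
# the best cosine match of a current-window topic to any previous-window topic.
# High-novelty topics are candidate emerging frontiers. Illustrative only; the
# paper also uses centrality and influence metrics.
import numpy as np

def topic_novelty(phi_prev, phi_curr):
    """phi_*: (n_topics, vocab_size) topic-word distributions of adjacent windows."""
    a = phi_prev / np.linalg.norm(phi_prev, axis=1, keepdims=True)
    b = phi_curr / np.linalg.norm(phi_curr, axis=1, keepdims=True)
    sim = a @ b.T                  # (prev_topics, curr_topics)
    best_match = sim.max(axis=0)   # best predecessor for each current topic
    return 1.0 - best_match        # novelty per current-window topic

rng = np.random.default_rng(2)
phi_prev = rng.dirichlet(np.ones(200), size=6)
phi_curr = rng.dirichlet(np.ones(200), size=6)
print(np.round(topic_novelty(phi_prev, phi_curr), 3))
```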

20.
Scientific knowledge dynamics and relatedness in biotech cities (Total citations: 4; self-citations: 0; other citations: 0)
This paper investigates the impact of scientific relatedness on knowledge dynamics in biotech at the city level during the period 1989–2008. We assess the extent to which the emergence of new research topics and the disappearance of existing topics in cities depend on their degree of scientific relatedness with the topics already present in those cities. We make use of the rise and fall of title words in scientific publications in biotech to identify major cognitive developments within the field. We determined the degree of relatedness between 1028 scientific topics in biotech by means of the co-occurrence of pairs of topics in journal articles. We combined this relatedness indicator with the scientific portfolio of cities (i.e. the topics on which they published previously) to determine how cognitively close a potentially new topic (or an existing topic) is to the scientific portfolio of a city. We analyzed knowledge dynamics at the city level by looking at the entry and exit of topics in the scientific portfolios of 276 cities in the world. We found strong and robust evidence that new scientific topics in biotech tend to emerge systematically in cities where scientifically related topics already exist, while existing scientific topics had a higher probability of disappearing from a city when they were weakly related to the city's scientific portfolio.
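A minimal sketch of co-occurrence-based relatedness between topics (toy articles; the association-strength normalization is one common choice and may differ from the paper's):

```python
# Sketch: relatedness between topics from their co-occurrence in articles,
# normalized by an association-strength measure cooc_ij / (n_i * n_j).
# Toy data; the paper's exact normalization may differ.
import numpy as np

# Each article is represented by the set of topic ids appearing in its title
articles = [{0, 1}, {0, 1, 2}, {2, 3}, {1, 2}, {0, 3}]
n_topics = 4

cooc = np.zeros((n_topics, n_topics))
occ = np.zeros(n_topics)
for topic_set in articles:
    for i in topic_set:
        occ[i] += 1
    for i in topic_set:
        for j in topic_set:
            if i != j:
                cooc[i, j] += 1

# Association strength: how much more often two topics co-occur than expected
relatedness = cooc / np.outer(occ, occ)
np.fill_diagonal(relatedness, 0.0)
print(np.round(relatedness, 2))
```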
