Identifying crisis-related informative tweets using learning on distributions
Institution: 1. Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; 2. University of Chinese Academy of Sciences, Beijing, China; 3. Institute of Software, Chinese Academy of Sciences, Beijing, China; 4. State Grid Energy Research Institute, Beijing, China; 1. State Key Lab of Mathematical Engineering and Advanced Computing, 450001 China; 2. School of Cyber Science and Engineering, Wuhan University, 430079 China; 3. Zhengzhou University of Light Industry, 450002, China; 1. Sorbonne Université, CNRS, LIP6, Paris F-75005, France; 2. Université Paris-Saclay, CNRS, ISP, ENS Paris-Saclay, Cachan, France; 1. School of Mathematical Sciences, University of Adelaide, SA 5005, Australia; 2. ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Australia; 3. Data to Decisions Cooperative Research Centre (D2D CRC), Kent Town, SA 5067, Australia; 4. D2D CRC stream lead, Australia
Abstract: Social networks such as Twitter give people a way to express themselves and ask for help in times of crisis. To provide help, however, authorities must pick out the informative posts from the vast number of non-informative ones in order to understand what is actually happening. Traditional methods for identifying informative posts emphasize the presence or absence of certain words, which limits their ability to classify these posts. In contrast, this paper proposes to consider the overall distribution of words in a post. Based on the distributional hypothesis in linguistics, we assume that each tweet is a distribution from which a sample of words has been drawn. Building on recent developments in learning methods, namely learning on distributions, we propose an approach that identifies informative tweets under this distributional assumption. Extensive experiments were performed on Twitter data from more than 20 crisis incidents covering nearly all incident types. These experiments show the superiority of the proposed approach in a number of real crisis incidents. This implies that better modelling of the content of a tweet, based on recent advances in estimating distributions and on domain-specific knowledge for various types of crisis incidents such as floods or earthquakes, may help to achieve higher accuracy in the task.
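The idea of treating a tweet as a sample drawn from a distribution can be illustrated with a small sketch. The abstract does not specify the estimator or classifier used, so the code below swaps in one standard learning-on-distributions technique as an assumption: a mean-embedding kernel with an RBF base kernel, and the resulting (biased) squared maximum mean discrepancy (MMD) between two tweets. The word vectors, the labelled examples, and all function names are hypothetical and invented purely for illustration.

```python
import math

# Hypothetical 2-D word vectors, invented for illustration only.
# A real system would use pretrained embeddings.
WORD_VECTORS = {
    "flood": (1.0, 0.1),
    "water": (0.9, 0.2),
    "trapped": (0.8, 0.0),
    "rescue": (0.9, 0.1),
    "lol": (0.0, 1.0),
    "coffee": (0.1, 0.9),
    "monday": (0.2, 0.8),
}

# Hypothetical labelled tweets (assumption, not data from the paper).
LABELLED = [
    ("flood trapped", "informative"),
    ("lol coffee monday", "non-informative"),
]


def rbf(x, y, gamma=1.0):
    """RBF base kernel between two word vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def embed(tweet):
    """Treat a tweet as a sample of word vectors; unknown words are skipped."""
    sample = [WORD_VECTORS[w] for w in tweet.lower().split() if w in WORD_VECTORS]
    if not sample:
        raise ValueError("tweet contains no known words")
    return sample


def mean_embedding_kernel(sample_p, sample_q, gamma=1.0):
    """Average base-kernel value over all pairs of words from the two samples."""
    total = sum(rbf(x, y, gamma) for x in sample_p for y in sample_q)
    return total / (len(sample_p) * len(sample_q))


def mmd_squared(sample_p, sample_q, gamma=1.0):
    """Biased estimate of the squared MMD between the two word distributions."""
    return (
        mean_embedding_kernel(sample_p, sample_p, gamma)
        + mean_embedding_kernel(sample_q, sample_q, gamma)
        - 2.0 * mean_embedding_kernel(sample_p, sample_q, gamma)
    )


def classify(tweet, labelled=LABELLED):
    """Return the label of the labelled tweet whose distribution is closest in MMD."""
    sample = embed(tweet)
    nearest = min(labelled, key=lambda pair: mmd_squared(sample, embed(pair[0])))
    return nearest[1]
```

Calling `classify("water rescue")` compares the whole bag of word vectors against each labelled tweet, so it can match a crisis tweet that shares no exact word with the labelled example. A nearest-neighbour rule is used here only to keep the sketch short; the same kernel could instead be passed to a kernel classifier.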
This article is indexed in ScienceDirect and other databases.