Single-document and multi-document summarization techniques for email threads using sentence compression
Authors: David M Zajic, Bonnie J Dorr, Jimmy Lin
Institution: 1. Department of Computer Science, University of Maryland, College Park, MD 20742, United States; 2. College of Information Studies, University of Maryland, College Park, MD 20742, United States
Abstract: We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection, a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre.
Keywords: Email summarization; Sentence compression; Trimming; Enron; Informal media
This article is indexed in ScienceDirect and other databases.