Quantifying the Limits and Success of Extractive Summarization Systems Across Domains

Description:

This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains, consisting of newswire, literary, scientific and legal documents.

Creation Date: June 2010
Partner(s):
UNT College of Engineering
Collection(s):
UNT Scholarly Works
Creator (Author):
Ceylan, Hakan

University of North Texas

Creator (Author):
Mihalcea, Rada, 1974-

University of North Texas

Creator (Author):
Ozertem, Umut

Yahoo! Labs

Creator (Author):
Lloret, Elena

University of Alicante

Creator (Author):
Palomar, Manuel

University of Alicante

Note:

Abstract: This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains: newswire, literary, scientific and legal documents. The authors present a study that explores the summary space of each domain via an exhaustive search strategy and finds the probability density function (pdf) of the ROUGE score distribution for each domain. The authors then use this pdf to calculate the percentile rank of extractive summarization systems. Their results introduce a new way to judge the success of automatic summarization systems and offer quantified explanations for questions such as why it has been so hard for systems to date to achieve a statistically significant improvement over the lead baseline in the news domain.
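
The idea in the abstract (enumerate the summary space, collect the score distribution, then read off a system's percentile rank) can be illustrated with a small sketch. This is not the authors' implementation: the scoring function is a simplified unigram-recall stand-in for ROUGE, the document and reference summary are invented toy data, the empirical distribution is used directly instead of a fitted pdf, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rouge1_recall(candidate_tokens, reference_tokens):
    """Simplified ROUGE-1 recall (assumption): clipped unigram overlap / reference length."""
    cand = Counter(candidate_tokens)
    ref = Counter(reference_tokens)
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / max(1, sum(ref.values()))


def summary_space_scores(sentences, reference, k):
    """Score every k-sentence extract of the document (exhaustive search)."""
    ref_tokens = reference.lower().split()
    scores = {}
    for idx in combinations(range(len(sentences)), k):
        tokens = " ".join(sentences[i] for i in idx).lower().split()
        scores[idx] = rouge1_recall(tokens, ref_tokens)
    return scores


def percentile_rank(score, all_scores):
    """Percentage of extracts in the summary space scoring at or below `score`."""
    all_scores = list(all_scores)
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


# Toy document and reference summary, invented for illustration only.
sentences = [
    "the court ruled on the merger today",
    "analysts expected a different outcome",
    "the merger was blocked by the court",
    "shares fell after the ruling",
]
reference = "the court blocked the merger"

scores = summary_space_scores(sentences, reference, k=2)
lead = tuple(range(2))  # lead baseline: the first k sentences
print(len(scores), scores[lead], percentile_rank(scores[lead], scores.values()))
```

Exhaustive enumeration grows combinatorially with document length, so a sketch like this is only practical for short documents and small extract sizes.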

Physical Description:

9 p.

Subject(s):
Keyword(s): extractive summarizations | automatic text summarizations
Source: Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2010, Los Angeles, California, United States
Identifier:
  • ARK: ark:/67531/metadc31026
Resource Type: Paper
Format: Text
Rights:
Access: Public