Language Independent Extractive Summarization
Department of Computer Science and Engineering
University of North Texas
Abstract
We demonstrate TextRank - a system for unsupervised extractive
summarization that relies on the application of iterative graph-based
ranking algorithms to graphs encoding the cohesive structure of a text.
An important characteristic of the system is that it does not rely on
any language-specific knowledge resources or any manually constructed
training data, and thus it is highly portable to new languages or
domains.
1 Introduction
Given the overwhelming amount of information available today, on the
Web and elsewhere, techniques for efficient automatic text
summarization are essential to improve access to such information.
Algorithms for extractive summarization are typically based on
techniques for sentence extraction, and attempt to identify the set of
sentences that are most important for the understanding of a given
document.
Some of the most successful approaches to extractive summarization
consist of supervised algorithms that attempt to learn what makes a
good summary by training on collections of summaries built for a
relatively large number of training documents, e.g. (Hirao et al.,
2002), (Teufel and Moens, 1997). However, the price paid for the high
performance of such supervised algorithms is their inability to easily
adapt to new languages or domains, as new training data are required
for each new type of data. TextRank (Mihalcea and Tarau, 2004),
(Mihalcea, 2004) is specifically designed to address this problem, by
using an extractive summarization technique that does not require any
training data or any language-specific knowledge sources. TextRank can
be effectively applied to the summarization of documents in different
languages without any modifications of the algorithm and without any
requirements for additional data. Moreover, results from experiments
performed on standard data sets have demonstrated that the performance
of TextRank is competitive with that of some of the best summarization
systems available today.
2 Extractive Summarization
Ranking algorithms, such as Kleinberg's HITS algorithm (Kleinberg,
1999) or Google's PageRank (Brin and Page, 1998), have been
traditionally and successfully used in Web-link analysis, social
networks, and more recently in text processing applications. In short,
a graph-based ranking algorithm is a way of deciding on the importance
of a vertex within a graph, by taking into account global information
recursively computed from the entire graph, rather than relying only on
local vertex-specific information. The basic idea implemented by the
ranking model is that of voting or recommendation. When one vertex
links to another one, it is basically casting a vote for that other
vertex. The higher the number of votes that are cast for a vertex, the
higher the importance of the vertex.
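The voting idea can be illustrated with a minimal PageRank-style sketch. This is an illustration only, not the TextRank implementation: the function name rank_vertices, the damping factor of 0.85, the fixed iteration count, and the even redistribution of score from vertices with no outgoing links are all assumptions made for this example.

```python
def rank_vertices(edges, damping=0.85, iterations=50):
    """Score vertices of a directed graph by recursive voting.

    edges: iterable of (source, target) pairs; each pair is one vote
    cast by `source` for `target`. The damping value and the iteration
    count are illustrative assumptions, not values from the paper.
    """
    edges = list(edges)
    vertices = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in vertices}
    for source, target in edges:
        out_links[source].append(target)

    n = len(vertices)
    scores = {v: 1.0 / n for v in vertices}  # start from uniform scores
    for _ in range(iterations):
        # Every vertex keeps a small baseline score.
        new_scores = {v: (1.0 - damping) / n for v in vertices}
        for source in vertices:
            # A vertex without outgoing links spreads its score evenly.
            targets = out_links[source] or vertices
            share = damping * scores[source] / len(targets)
            for target in targets:
                # A vote weighs more when the voter itself scores highly.
                new_scores[target] += share
        scores = new_scores
    return scores


if __name__ == "__main__":
    votes = [("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]
    for vertex, score in sorted(rank_vertices(votes).items()):
        print(vertex, round(score, 3))
```

In this toy graph, C receives votes from three vertices and ends up with the highest score, while A outranks B and D because its single vote comes from the highly ranked C. That is the sense in which the score depends on information computed recursively from the entire graph rather than on a local vote count.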
These graph ranking algorithms are based on a random walk model, where
a walker takes random steps on the graph, with the walk being modeled
as a Markov process - that is, the decision on what edge to follow is
solely based on the vertex where the walker is currently located.
Under certain conditions, this
Mihalcea, Rada, 1974-. Language Independent Extractive Summarization, paper, July 2005; [Stroudsburg, Pennsylvania]. (https://digital.library.unt.edu/ark:/67531/metadc30967/m1/1/: accessed June 26, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.