Corpus-based and Knowledge-based Measures of Text Semantic Similarity

PDF Version Also Available for Download.

Description

This article discusses corpus-based and knowledge-based measures of text semantic similarity.

Physical Description

6 p.

Creation Information

Mihalcea, Rada, 1974-; Corley, Courtney & Strapparava, Carlo, 1962-. July 2006.

Context

This paper is part of the collection entitled UNT Scholarly Works and was provided by the UNT College of Engineering to the UNT Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 2,404 times, with 61 views in the last month. More information about this paper can be viewed below.

Who

People and organizations associated with either the creation of this paper or its content.

Authors

Provided By

UNT College of Engineering

The UNT College of Engineering promotes intellectual and scholarly pursuits in the areas of computer science and engineering, preparing innovative leaders in a variety of disciplines. The UNT College of Engineering encourages faculty and students to pursue interdisciplinary research among numerous subjects of study including databases, numerical analysis, game programming, and computer systems architecture.

What

Descriptive information to help identify this paper. Follow the links below to find similar items on the Digital Library.

Notes

Copyright 2006 American Association for Artificial Intelligence (AAAI). All rights reserved. http://www.aaai.org

Abstract: This paper presents a method for measuring the semantic similarity of texts using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g., text classification, information retrieval) or individual words (e.g., synonymy tests). Because a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g., abstracts of scientific documents, image captions, product descriptions), the authors focus on measuring the semantic similarity of short texts. Through experiments performed on a paraphrase data set, the authors show that the semantic similarity method outperforms methods based on simple lexical matching, resulting in up to a 13% error rate reduction with respect to the traditional vector-based similarity metric.
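
For orientation, the kind of lexical-matching baseline the abstract contrasts against can be sketched as cosine similarity over term-frequency vectors. This is only an illustrative sketch, not the authors' implementation: the tokenizer, the raw term-frequency weighting, and the function names tokenize and cosine_similarity are all assumptions introduced here, and the paper's own baseline may differ in weighting and preprocessing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    # Sparse term-frequency vectors, one count per distinct token.
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))

    # Dot product over the tokens that both texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())

    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # An empty text has no direction; treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Two short texts with related meaning but only one shared word.
    print(cosine_similarity("I own a dog", "I have an animal"))  # 0.25
```

The two sample sentences share only the word "I", so a purely lexical score stays low even though the meanings are related. That gap on short texts is the limitation the abstract says word-level semantic similarity measures are meant to address.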

Source

  • American Association for Artificial Intelligence (AAAI) Conference, 2006, Boston, Massachusetts, United States

Collections

This paper is part of the following collection of related materials.

UNT Scholarly Works

Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.

When

Dates and time periods associated with this paper.

Creation Date

  • July 2006

Added to The UNT Digital Library

  • Jan. 31, 2011, 2:01 p.m.

Description Last Updated

  • March 27, 2014, 11:42 a.m.

Usage Statistics

When was this paper last used?

Yesterday: 2
Past 30 days: 61
Total Uses: 2,404

Citations, Rights, Re-Use

Mihalcea, Rada, 1974-; Corley, Courtney & Strapparava, Carlo, 1962-. Corpus-based and Knowledge-based Measures of Text Semantic Similarity, paper, July 2006; (digital.library.unt.edu/ark:/67531/metadc30981/: accessed June 26, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT College of Engineering.