An Automatic Method for Generating Sense Tagged Corpora

Description:

This paper discusses an automatic method for generating sense tagged corpora.

Creation Date: 1999  
Partner(s):
UNT College of Engineering
Collection(s):
UNT Scholarly Works
Creator (Author):
Mihalcea, Rada, 1974-

University of North Texas; Southern Methodist University

Creator (Author):
Moldovan, Dan I.

Southern Methodist University

Note:

Copyright 1999 American Association for Artificial Intelligence (AAAI). All rights reserved. http://www.aaai.org

Note:

Abstract: The unavailability of very large corpora with semantically disambiguated words is a major limitation in text processing research. For example, statistical methods for word sense disambiguation of free text are known to achieve high accuracy when large corpora are available to develop context rules and to train and test them. This article presents a novel approach to automatically generating arbitrarily large corpora for word senses. The method is based on (1) the information provided in WordNet, used to formulate queries consisting of synonyms or definitions of word senses, and (2) the information gathered from the Internet using existing search engines. The method was tested on 120 word senses, and a precision of 91% was observed.
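
Illustration: the query-formulation step named in the abstract can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation. The `Sense` structure, the `build_queries` function, the quoted-phrase query syntax, and the ordering (synonyms first, then the definition) are all assumptions made for illustration. The sample entry is hand-written and is not copied from WordNet, and no search engine is contacted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    """Hypothetical stand-in for one WordNet word sense."""
    word: str
    sense_id: int
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""


def build_queries(sense: Sense) -> List[str]:
    """Build search query strings for one word sense.

    Assumption: each synonym other than the word itself becomes one
    quoted query, followed by one quoted query for the definition.
    """
    queries = []
    for synonym in sense.synonyms:
        # Skip the target word: it is ambiguous, so it cannot identify the sense.
        if synonym.lower() != sense.word.lower():
            queries.append(f'"{synonym}"')
    if sense.definition:
        queries.append(f'"{sense.definition}"')
    return queries


# Hand-written illustrative entry (not taken from WordNet).
example = Sense(
    word="remember",
    sense_id=1,
    synonyms=["recollect", "remember"],
    definition="recall knowledge from memory",
)
print(build_queries(example))
```

Each resulting string would then be submitted to a search engine, and the retrieved text would be associated with the sense that produced the query. How the results are filtered and tagged is not stated in the abstract, so it is left out of this sketch.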

Physical Description:

6 p.

Subject(s):
Keyword(s): sense tagged corpora | WordNet | natural language processing | word sense disambiguation
Source: Sixteenth National Conference on Artificial Intelligence, 1999, Orlando, Florida, United States
Identifier:
  • ARK: ark:/67531/metadc83300
Resource Type: Paper
Format: Text
Rights:
Access: Public