An Automatic Method for Generating Sense Tagged Corpora

Date: 1999
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper discusses an automatic method for generating sense tagged corpora. Abstract: The unavailability of very large corpora with semantically disambiguated words is a major limitation in text processing research. For example, statistical methods for word sense disambiguation of free text are known to achieve high accuracy when large corpora are available to develop, train, and test context rules. This article presents a novel approach to automatically generate arbitrarily large corpora for word senses. The method is based on (1) the information provided in WordNet, used to formulate queries consisting of synonyms or definitions of word senses, and (2) the information gathered from the Internet using existing search engines. The method was tested on 120 word senses, and a precision of 91% was observed.
Contributing Partner: UNT College of Engineering
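The query-formulation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense entries, their glosses, and the function names `build_queries` and `queries_for_all` are hypothetical, and a real system would read synonyms and definitions from WordNet and send the queries to a search engine.

```python
# Hypothetical, hand-written sense entries used only for illustration.
# A real system would read synonyms and definitions from WordNet.
SENSES = {
    "interest#1": {
        "synonyms": ["involvement"],
        "gloss": "a sense of concern with and curiosity about something",
    },
    "interest#2": {
        "synonyms": ["stake"],
        "gloss": "a right or legal share of something",
    },
}


def build_queries(sense):
    """Return quoted query strings for one sense.

    One query is produced per synonym, followed by one query made from
    the definition, so that retrieved text is likely to use that sense.
    """
    queries = [f'"{synonym}"' for synonym in sense["synonyms"]]
    queries.append(f'"{sense["gloss"]}"')
    return queries


def queries_for_all(senses):
    """Map each sense identifier to its list of query strings."""
    return {sense_id: build_queries(sense) for sense_id, sense in senses.items()}


if __name__ == "__main__":
    for sense_id, queries in queries_for_all(SENSES).items():
        print(sense_id, queries)
```

Text returned for a query would then be labeled with the sense that produced the query, which is how the corpus grows without manual tagging.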
Word Sense Disambiguation based on Semantic Density

Date: August 1998
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper presents a Word Sense Disambiguation method based on the idea of semantic density between words. The disambiguation is done in the context of WordNet. The Internet is used as a raw corpus to provide statistical information for word associations. A metric is introduced and used to measure the semantic density and to rank all possible combinations of the senses of two words. This method provides a precision of 58% in indicating the correct sense for both words at the same time. The precision increases as we consider more choices: 70% for the top two ranked and 73% for the top three ranked.
Contributing Partner: UNT College of Engineering
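Ranking all combinations of the senses of two words can be sketched as follows. The abstract does not define the semantic density metric, so this sketch substitutes a plain gloss word-overlap count as a stand-in score. The glosses, the sense identifiers, and the names `overlap_score` and `rank_sense_pairs` are assumptions made for illustration.

```python
from itertools import product


def overlap_score(gloss_a, gloss_b):
    """Stand-in score (NOT the paper's metric): distinct words shared by two glosses."""
    return len(set(gloss_a.lower().split()) & set(gloss_b.lower().split()))


def rank_sense_pairs(senses_a, senses_b, score=overlap_score):
    """Score every (sense of word A, sense of word B) pair and sort best first."""
    pairs = product(senses_a.items(), senses_b.items())
    scored = [((id_a, id_b), score(gloss_a, gloss_b))
              for (id_a, gloss_a), (id_b, gloss_b) in pairs]
    # sorted() is stable, so tied pairs keep their enumeration order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical glosses, written for this example only.
BANK = {
    "bank#1": "a financial institution that accepts deposits of money",
    "bank#2": "sloping land beside a body of water",
}
DEPOSIT = {
    "deposit#1": "money given to a financial institution for safekeeping",
    "deposit#2": "matter that has been laid down by a natural process",
}

ranking = rank_sense_pairs(BANK, DEPOSIT)

if __name__ == "__main__":
    for pair, value in ranking:
        print(pair, value)
```

Because `score` is a parameter, a WordNet-based density measure could replace the overlap count without changing the ranking code.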
A WordNet-Based Interface to Internet Search Engines

Date: May 1998
Creator: Moldovan, Dan & Mihalcea, Rada
Description: This paper discusses a WordNet-based interface to Internet search engines. A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Several search engines with different characteristics, such as Alta Vista, Lycos, Infoseek, and others, are available. However, web information retrieval technology is still in its infancy, and there is a need for considerable improvement. Some inherent difficulties are: (1) the web information is diverse and highly unstructured, (2) the size of information is large and it grows at an exponential rate, and (3) the current search engine technology is still rudimentary. While the first two issues are more profound and require long-term solutions, it may be possible to develop software around the search engines to improve the quality of the information retrieved. In this paper, the authors present a natural language interface system to a search engine and discuss some of the results obtained.
Contributing Partner: UNT College of Engineering
A Method for Word Sense Disambiguation of Unrestricted Text

Date: June 1999
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper discusses a method for word sense disambiguation of unrestricted text. Selecting the most appropriate sense for an ambiguous word in a sentence is a central problem in Natural Language Processing. In this paper, the authors present a method that attempts to disambiguate all the nouns, verbs, adverbs and adjectives in a text, using the senses provided in WordNet. The senses are ranked using two sources of information: (1) the Internet for gathering statistics for word-word co-occurrences and (2) WordNet for measuring the semantic density for a pair of words. The authors report an average accuracy of 80% for the first ranked sense, and 91% for the first two ranked senses. Extensions of this method for larger windows of more than two words are considered.
Contributing Partner: UNT College of Engineering
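The two figures reported in this abstract, accuracy for the first ranked sense and for the first two ranked senses, are instances of top-k accuracy. A minimal sketch of that measurement follows. The rankings and gold labels are invented toy data, not results from the paper, and `top_k_accuracy` is a hypothetical helper name.

```python
def top_k_accuracy(ranked_predictions, gold, k):
    """Fraction of words whose correct sense is among the first k ranked senses."""
    if len(ranked_predictions) != len(gold):
        raise ValueError("ranked_predictions and gold must have the same length")
    hits = sum(1 for ranked, correct in zip(ranked_predictions, gold)
               if correct in ranked[:k])
    return hits / len(gold)


# Toy data: one ranked sense list per word, best sense first.
RANKED = [
    ["s1", "s2", "s3"],
    ["s2", "s1"],
    ["s3", "s1", "s2"],
    ["s1", "s2"],
]
GOLD = ["s1", "s1", "s2", "s1"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, top_k_accuracy(RANKED, GOLD, k))
```

On this toy data the value is 0.5 for k = 1, 0.75 for k = 2, and 1.0 for k = 3, which shows why accuracy can only stay level or rise as more ranked senses are accepted.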
LASSO: A Tool for Surfing the Answer Net

Date: November 1999
Creator: Moldovan, Dan I.; Harabagiu, Sanda M.; Paşca, Marius, 1974-; Mihalcea, Rada, 1974-; Goodrum, Richard A.; Gîrju, Corina R. et al.
Description: This paper discusses LASSO, a tool for surfing the answer net. Abstract: This paper presents the architecture, operation and results obtained with the LASSO system developed in the Natural Language Processing Laboratory at SMU. The system relies on a combination of syntactic and semantic techniques, and lightweight abductive inference to find answers. The search for the answer is based on a novel form of indexing called paragraph indexing. A score of 55.5% for short answers and 64.5% for long answers was achieved.
Contributing Partner: UNT College of Engineering