LASSO: A Tool for Surfing the Answer Net

Date: November 1999
Creator: Moldovan, Dan I.; Harabagiu, Sanda M.; Paşca, Marius, 1974-; Mihalcea, Rada, 1974-; Goodrum, Richard A.; Gîrju, Corina R. et al.
Description: This paper discusses LASSO, a tool for surfing the answer net. Abstract: This paper presents the architecture, operation and results obtained with the LASSO system developed in the Natural Language Processing Laboratory at SMU. The system relies on a combination of syntactic and semantic techniques, and lightweight abductive inference to find answers. The search for the answer is based on a novel form of indexing called paragraph indexing. A score of 55.5% for short answers and 64.5% for long answers was achieved.
Contributing Partner: UNT College of Engineering
A Method for Word Sense Disambiguation of Unrestricted Text

Date: June 1999
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This paper discusses a method for word sense disambiguation of unrestricted text. Abstract: Selecting the most appropriate sense for an ambiguous word in a sentence is a central problem in Natural Language Processing. In this paper, the authors present a method that attempts to disambiguate all the nouns, verbs, adverbs and adjectives in a text, using the senses provided in WordNet. The senses are ranked using two sources of information: (1) the Internet for gathering statistics for word-word co-occurrences and (2) WordNet for measuring the semantic density for a pair of words. The authors report an average accuracy of 80% for the first ranked sense, and 91% for the first two ranked senses. Extensions of this method for larger windows of more than two words are considered.
Contributing Partner: UNT College of Engineering
An Automatic Method for Generating Sense Tagged Corpora

Date: 1999
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper discusses an automatic method for generating sense tagged corpora. Abstract: The unavailability of very large corpora with semantically disambiguated words is a major limitation in text processing research. For example, statistical methods for word sense disambiguation of free text are known to achieve high accuracy results when large corpora are available to develop context rules, to train and test them. This article presents a novel approach to automatically generate arbitrarily large corpora for word senses. The method is based on (1) the information provided in WordNet, used to formulate queries consisting of synonyms or definitions of word senses, and (2) the information gathered from the Internet using existing search engines. The method was tested on 120 word senses and a precision of 91% was observed.
Contributing Partner: UNT College of Engineering
Word Sense Disambiguation based on Semantic Density

Date: August 1998
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper presents a Word Sense Disambiguation method based on the idea of semantic density between words. The disambiguation is done in the context of WordNet. The Internet is used as a raw corpus to provide statistical information for word associations. A metric is introduced and used to measure the semantic density and to rank all possible combinations of the senses of two words. This method provides a precision of 58% in indicating the correct sense for both words at the same time. The precision increases as we consider more choices: 70% for top two ranked and 73% for top three ranked.
Contributing Partner: UNT College of Engineering
A WordNet-Based Interface to Internet Search Engines

Date: May 1998
Creator: Moldovan, Dan & Mihalcea, Rada
Description: This paper discusses a WordNet-based interface to Internet search engines. A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Several search engines with different characteristics, such as Alta Vista, Lycos, Infoseek, and others are available. However, web information retrieval technology is still in its infancy, and there is a need for considerable improvement. Some inherent difficulties are: (1) the web information is diverse and highly unstructured, (2) the size of information is large and it grows at an exponential rate, and (3) the current search engine technology is still rudimentary. While the first two issues are more profound and require long term solutions, it may be possible to develop software around the search engines to improve the quality of the information retrieved. In this paper the authors present a natural language interface system to a search engine and discuss some of the results obtained.
Contributing Partner: UNT College of Engineering
CIMI's Z39.50 Interoperability Testbed: Search and Retrieval of Distributed Cultural Heritage Information

Date: January 2, 1998
Creator: Moen, William E.
Description: This paper discusses the Consortium for the Computer Interchange of Museum Information (CIMI)'s international effort to provide distributed search and retrieval of cultural heritage information. A primary aspect of CIMI's work utilizes ANSI/NISO Z39.50-1995, an American National Standard protocol for information retrieval. The International Organization for Standardization (ISO) recently approved Z39.50 as ISO 23950. CIMI chose Z39.50 to enable uniform access to existing and emerging digital collections and the vast repositories of cultural heritage information resources. These resources include a variety of physical and digital objects--physical artifacts and digital derivatives of those artifacts, descriptive records designed for collection management, bibliographic records, full-text documents, online tools such as thesauri and authoritative lists of artists' names, and more. CIMI's application of Z39.50 in the networked cultural heritage information environment is breaking new ground in distributed and integrated access to textual and non-textual digital collections.
Contributing Partner: UNT College of Information
The Role of Content Analysis in Evaluating Metadata for the U.S. Government Information Locator Service (GILS): Results from an Exploratory Study

Date: 1997
Creator: Moen, William E.; Stewart, Erin L. & McClure, Charles R.
Description: This paper discusses the application of qualitative and quantitative content analysis techniques to assess metadata records from 42 Federal agencies' implementation of the Government Information Locator Service (GILS). GILS databases respond to a late-1994 initiative to "identify public information resources throughout the Federal government, describe the information available in those resources, and provide assistance in obtaining the information [and] serve as a tool to improve agency electronic records management practices". GILS metadata records describe agencies' automated information systems, Privacy Act systems of records, and locators that cover its information dissemination products. The authors used record content analysis, and several other methods, to examine whether GILS is meeting user expectations. Criteria used in the current analysis were informed in part by results of user and service-implementor questionnaires and focus groups. The record content analysis itself, in turn, informed creation of a scripted online assessment for users, and data from that user assessment supplemented results of the content analysis. The quality of metadata for networked resources is as yet a relatively unexplored research area. At this point, no consensus has been reached on operational and conceptual definitions of quality; likewise, validated procedures for assessing metadata are lacking. On the basis of the ...
Contributing Partner: UNT College of Information