Answering complex, list and context questions with LCC's Question-Answering Server
Date: November 2001
Creator: Harabagiu, Sanda M.; Moldovan, Dan I.; Paşca, Marius, 1974-; Surdeanu, Mihai; Mihalcea, Rada, 1974-; Gîrju, Roxana et al.
Description: Abstract: This paper presents the architecture of the Question-Answering server (QAS) developed at the Language Computer Corporation (LCC) and used in the TREC-10 evaluations. LCC's QAS™ extracts answers for (a) factual questions of variable degree of difficulty; (b) questions that expect lists of answers; and (c) questions posed in the context of previous questions and answers. One of the major novelties is the implementation of bridging inference mechanisms that guide the search for answers to complex questions. Additionally, LCC's QAS™ encodes an efficient way of modeling context via reference resolution. In TREC-10, this system generated an RAR of 0.58 on the main task and 0.78 on the context task.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83297/
The Structure and Performance of an Open-Domain Question Answering System
Date: October 2000
Creator: Moldovan, Dan I.; Harabagiu, Sanda M.; Paşca, Marius, 1974-; Mihalcea, Rada, 1974-; Gîrju, Roxana; Goodrum, Richard A. et al.
Description: This paper presents the architecture, operation and results obtained with the LASSO Question Answering system developed in the Natural Language Processing Laboratory at SMU. To find answers, the system relies on a combination of syntactic and semantic techniques. The search for the answer is based on a novel form of indexing called paragraph indexing. A score of 55.5% for short answers and 64.5% for long answers was achieved at the TREC-8 competition.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83312/
A Semi-Complete Disambiguation Algorithm for Open Text
Date: 2000
Creator: Mihalcea, Rada, 1974-
Description: This paper discusses a semi-complete disambiguation algorithm for open text. Word Sense Disambiguation (WSD) is one of the most difficult areas of Natural Language Processing (NLP); the semantic comprehension of a text, and the ability to expand a text with semantically related information, depend heavily on the availability of a highly accurate WSD algorithm. Solutions considered so far by researchers for the WSD problem make use of machine-readable dictionaries (Leacock, Chodorow and Miller 1998) or of information gathered from raw or semantically disambiguated corpora (Yarowsky 1995). These methods are either designed to work with a few pre-selected words, in which case high accuracy is obtained, or they are general methods that disambiguate all the words in a text with lower precision. With the present work, the authors try to achieve a compromise between these two directions. There are fields in NLP, such as Information Retrieval, that could benefit from a method that performs a semi-complete disambiguation (i.e., it disambiguates only a certain percentage of the words in a text) but is highly accurate.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83293/
The Semantic Wildcard
Date: May 2002
Creator: Mihalcea, Rada, 1974-
Description: The IRSLO (Information Retrieval using Semantic and Lexical Operators) project aims at integrating semantic and lexical information into the retrieval process, in order to overcome some of the impediments encountered with today's information retrieval systems. This paper introduces the semantic wildcard, one of the most powerful operators implemented in IRSLO, which allows for searches along general-specific lines. The semantic wildcard, denoted with #, acts in a manner similar to the lexical wildcard, but at the semantic level, enabling the retrieval of subsumed concepts. For instance, a search for animal# will match any concept that is of type animal, including dog, goat, and so forth, thereby going beyond the explicit knowledge stated in texts. Both this operator and a lexical locality operator that enables the retrieval of paragraphs rather than entire documents have been implemented in the IRSLO system and tested on information requests run against an index of 130,000 documents. Significant improvement was observed over classic keyword-based retrieval systems in terms of precision, recall and success rate.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83308/
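Illustration: the behavior of the semantic wildcard described in the record above (animal# matching dog, goat, and other subsumed concepts) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the IRSLO implementation: the HYPERNYM table is a hypothetical toy taxonomy rather than WordNet, and the function names is_a, matches and search are invented for this example.

```python
# Toy is-a taxonomy: child concept -> parent concept.
# Hypothetical data for illustration only; not taken from IRSLO or WordNet.
HYPERNYM = {
    "dog": "animal",
    "goat": "animal",
    "poodle": "dog",
    "oak": "plant",
}

def is_a(concept, ancestor):
    """Return True if concept equals ancestor or is subsumed by it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = HYPERNYM.get(concept)  # climb one level; None at the top
    return False

def matches(query_term, word):
    """Match a word against a query term; a trailing '#' marks a semantic wildcard."""
    if query_term.endswith("#"):
        return is_a(word, query_term[:-1])
    return word == query_term  # plain keyword match

def search(query_term, documents):
    """Return the documents containing at least one word that matches the term."""
    return [
        doc for doc in documents
        if any(matches(query_term, w) for w in doc.lower().split())
    ]

docs = ["the poodle barked", "an oak fell", "a goat grazed"]
print(search("animal#", docs))  # semantic match on subsumed concepts
print(search("animal", docs))   # plain keyword match finds nothing
```

In this sketch the wildcard query retrieves the documents mentioning poodle and goat even though the word animal never appears in them, while the plain keyword query returns no documents.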
Automatic generation of a coarse grained WordNet
Date: June 2001
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This paper discusses automatic generation of a coarse grained WordNet. Abstract: Several principles for the automatic transformation of WordNet into a coarser grained dictionary are proposed. A new version of WordNet is derived, leading to a reduction of 26% in the average polysemy of words, while introducing a small error rate of 2.1%, as measured on a sense tagged corpus.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83310/
Word Sense Disambiguation based on Semantic Density
Date: August 1998
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This paper presents a Word Sense Disambiguation method based on the idea of semantic density between words. The disambiguation is done in the context of WordNet. The Internet is used as a raw corpus to provide statistical information for word associations. A metric is introduced and used to measure the semantic density and to rank all possible combinations of the senses of two words. This method provides a precision of 58% in indicating the correct sense for both words at the same time. The precision increases as more choices are considered: 70% for the top two ranked combinations and 73% for the top three.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83303/
Semantic Indexing using WordNet Senses
Date: October 2000
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: In this article, the authors describe a Boolean Information Retrieval system that adds word semantics to classic word-based indexing. Two of the main components of the system, namely indexing and retrieval, use a combined word-based and sense-based approach. The key to the system is a methodology for building semantic representations of open text at the word and collocation level. This new technique, called semantic indexing, shows improved effectiveness over classic word-based indexing techniques.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83301/
A WordNet-Based Interface to Internet Search Engines
Date: May 1998
Creator: Moldovan, Dan I. & Mihalcea, Rada, 1974-
Description: This paper discusses a WordNet-based interface to Internet search engines. A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Several search engines with different characteristics, such as Alta Vista, Lycos, Infoseek, and others, are available. However, web information retrieval technology is still in its infancy, and there is a need for considerable improvement. Some inherent difficulties are: (1) web information is diverse and highly unstructured, (2) the volume of information is large and grows at an exponential rate, and (3) current search engine technology is still rudimentary. While the first two issues are more profound and require long-term solutions, it may be possible to develop software around the search engines to improve the quality of the information retrieved. In this paper the authors present a natural language interface system to a search engine and discuss some of the results obtained.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83305/
An Automatic Method for Generating Sense Tagged Corpora
Date: 1999
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This paper discusses an automatic method for generating sense tagged corpora. Abstract: The unavailability of very large corpora with semantically disambiguated words is a major limitation in text processing research. For example, statistical methods for word sense disambiguation of free text are known to achieve high accuracy when large corpora are available to develop context rules and to train and test them. This article presents a novel approach to automatically generate arbitrarily large corpora for word senses. The method is based on (1) the information provided in WordNet, used to formulate queries consisting of synonyms or definitions of word senses, and (2) the information gathered from the Internet using existing search engines. The method was tested on 120 word senses and a precision of 91% was observed.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83300/
Flexible Allocation of Capacity in Multi-Cell CDMA Networks
Date: July 1999
Creator: Akl, Robert G.; Hegde, Manju V.; Naraghi-Pour, Mort & Min, Paul S.
Description: This presentation discusses flexible allocation of capacity in multi-cell CDMA networks. The effect of reverse power levels on the capacity of a code-division multiple-access (CDMA) cellular network is evaluated. The inter-cell and intra-cell interferences of every cell on every other cell are first calculated for a given network topology. Based on this, the nominal power of users is increased by a factor the authors call the Power Compensation Factor (PCF) which enables small cells to overcome the excessive interference from adjacent large cells.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc81377/