Building a Sense Tagged Corpus with Open Mind Word Expert
Date: July 2002
Creator: Chklovski, Timothy & Mihalcea, Rada
Description: This paper discusses Open Mind Word Expert, an implemented active learning system for collecting word sense tagging from the general public over the Web. It is available at http://teach-computers.org. The authors expect the system to yield a large volume of high-quality training data at a much lower cost than the traditional method of hiring lexicographers. The authors thus propose a Senseval-3 lexical sample activity where the training data is collected via Open Mind Word Expert. If successful, the collection process can be extended to create the definitive corpus of word sense information.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc81389/
Answering complex, list and context questions with LCC's Question-Answering Server
Date: November 2001
Creator: Harabagiu, Sanda; Moldovan, Dan; Paşca, Marius; Surdeanu, Mihai; Mihalcea, Rada; Gîrju, Roxana et al.
Description: Abstract: This paper presents the architecture of the Question-Answering server (QAS) developed at the Language Computer Corporation (LCC) and used in the TREC-10 evaluations. LCC's QAS™ extracts answers for (a) factual questions of variable degree of difficulty; (b) questions that expect lists of answers; and (c) questions posed in the context of previous questions and answers. One of the major novelties is the implementation of bridging inference mechanisms that guide the search for answers to complex questions. Additionally, LCC's QAS™ encodes an efficient way of modeling context via reference resolution. In TREC-10, this system generated an RAR of 0.58 on the main task and 0.78 on the context task.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83297/
Co-training and Self-training for Word Sense Disambiguation
Date: May 2004
Creator: Mihalcea, Rada, 1974-
Description: This paper investigates the application of co-training and self-training to word sense disambiguation. Optimal and empirical parameter selection methods for co-training and self-training are investigated, with various degrees of error reduction. A new method that combines co-training with majority voting is introduced, with the effect of smoothing the bootstrapping learning curves, and improving the average performance.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30955/
Classification of Attributes and Behavior in Risk Management Using Bayesian Networks
Date: March 2007
Creator: Dantu, Ram; Kolan, Prakash; Loper, Kall & Akl, Robert G.
Description: This paper discusses issues in security. Abstract: Security administration is an uphill task to implement in an enterprise network providing secured corporate services. With the slew of patches being released by network component vendors, system administrators require a barrage of tools for analyzing the risk due to vulnerabilities in those components. In addition, criticalities in patching some end hosts raise serious security issues about the network to which the end hosts are connected. In this context, it would be imperative to know the risk level of all critical resources, keeping in view the new vulnerabilities emerging every day. The authors hypothesize that the sequence of network actions by attackers depends on their social and attack profile (behavioral resources such as skill level, time, and attitude). To estimate the types of attack behavior, the authors surveyed individuals for their ability and attack intent. Using the individuals' responses, the authors determined their behavioral resources and classified them as having opportunist, hacker, or explorer behavior. The profile behavioral resources can be used for determining risk by an attacker having that profile. Thus, suitable vulnerability analysis and risk management strategies can be formulated to efficiently curtail the risk from different types of attackers.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30836/
Combining Lexical Resources for Contextual Synonym Expansion
Date: 2009
Creator: Sinha, Ravi & Mihalcea, Rada, 1974-
Description: This paper discusses combining lexical resources for contextual synonym expansion. Abstract: In this paper, we experiment with the task of contextual synonym expansion, and compare the benefits of combining multiple lexical resources using both unsupervised and supervised approaches. Overall, the results obtained through the combination of several resources exceed the current state-of-the-art when selecting the best synonym for a given target word, and place second when selecting the top ten synonyms, thus demonstrating the usefulness of the approach.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc31011/
Creating a Testbed for the Evaluation of Automatically Generated Back-of-the-book Indexes
Date: February 2006
Creator: Csomai, Andras & Mihalcea, Rada, 1974-
Description: This paper discusses the automatic generation of back-of-the-book indexes. Abstract: The automatic generation of back-of-the-book indexes seems to be out of sight of the Information Retrieval and Natural Language Processing communities, although the increasingly large number of books available in electronic format, as well as recent advances in keyphrase extraction, should motivate an increased interest in this topic. In this paper, the authors describe the background relevant to the process of creating back-of-the-book indexes, namely (1) a short overview of the origin and structure of back-of-the-book indexes, and (2) the correspondence that can be established between techniques for automatic index construction and keyphrase extraction. Since the development of any automatic system requires in the first place an evaluation testbed, the authors describe their work in building a gold standard collection of books and indexes, and present several metrics that can be used for the evaluation of automatically generated indexes against the gold standard. Finally, the authors investigate the properties of the gold standard index, such as index size, length of index entries, and upper bounds on coverage as indicated by the presence of index entries in the document.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30982/
Creating Large Annotated Data Collections with Web Users' Help
Date: April 2003
Creator: Mihalcea, Rada, 1974- & Chklovski, Timothy A. (Timothy Anatolievich), 1977-
Description: This paper discusses creating annotated data collections. Abstract: Open Mind Word Expert is an implemented active learning system that aims to create large annotated corpora by tapping into the world's vast pool of knowledge. It does this by relying on the vast number of Web users who contribute their knowledge to data annotation. Open Mind Word Expert focuses on building semantically annotated corpora, by collecting word sense tagging from the general public over the Web. During the first nine months of activity, the system yielded 90,000 high quality tagged items at a much lower cost than the traditional method of hiring lexicographers.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30949/
Attracting and Retaining Women in Computer Science and Engineering: Evaluating the Results
Date: June 2007
Creator: Keathly, David & Akl, Robert G.
Description: This paper discusses efforts to attract and retain students in computer science and engineering fields. Abstract: Computer science and engineering communities have been exploring a variety of activities and techniques to attract and retain more students, especially women and minorities, to computer science and computer engineering degree programs. This paper briefly describes the efforts and results of a plan for actively recruiting young women into undergraduate computer engineering and computer science programs hosted by the University of North Texas (UNT). It also describes a series of activities aimed at improving the retention rate of students already in our programs, particularly during the freshman year. Such recruitment and retention efforts are critical to the country's efforts to increase the number of engineering professionals, and are a priority for the Computer Science and Engineering (CSE) Department at UNT.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30834/
Automatic generation of a coarse grained WordNet
Date: June 2001
Creator: Mihalcea, Rada & Moldovan, Dan
Description: This paper discusses automatic generation of a coarse grained WordNet. Abstract: Several principles for the automatic transformation of WordNet into a coarser grained dictionary are proposed. A new version of WordNet is derived, leading to a reduction of 26% in the average polysemy of words, while introducing a small error rate of 2.1%, as measured on a sense tagged corpus.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83310/
Automatic Keyword Extraction for Learning Object Repositories
Date: October 2008
Creator: Coursey, Kino High; Mihalcea, Rada & Moen, William E.
Description: Abstract: This paper describes experiments in metadata generation for learning object repositories. Specifically, the authors present several methods for automatic keyword extraction and evaluate them on a collection of learning objects from an undergraduate history course. The results suggest that automatic keyword extraction is a viable solution for suggesting terms and phrases for metadata annotation.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc31003/