Corpus-based and Knowledge-based Measures of Text Semantic Similarity
Date: July 2006
Creator: Mihalcea, Rada, 1974-; Corley, Courtney & Strapparava, Carlo, 1962-
Description: Abstract: This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g. text classification, information retrieval) or individual words (e.g. synonymy tests). Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g. abstracts of scientific documents, image captions, product descriptions), in this paper the authors focus on measuring the semantic similarity of short texts. Through experiments performed on a paraphrase data set, the authors show that the semantic similarity method outperforms methods based on simple lexical matching, resulting in up to 13% error rate reduction with respect to the traditional vector-based similarity metric.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30981/
A Corpus-based Approach to Finding Happiness
Date: March 2006
Creator: Liu, Hugo & Mihalcea, Rada, 1974-
Description: This paper discusses how to locate emotions. Abstract: What are the sources of happiness and sadness in everyday life? In this paper, the authors employ 'linguistic ethnography' to seek out where happiness lies in our everyday lives by considering a corpus of blogposts from the LiveJournal community annotated with happy and sad moods. By analyzing this corpus, the authors derive lists of happy and sad words and phrases annotated by their 'happiness factor'. Various semantic analyses performed with this wordlist reveal the happiness trajectory of a 24-hour day (3am and 9-10pm are most happy), and a 7-day week (Wednesdays are saddest), and compare the socialness and human-centeredness of happy descriptions versus sad descriptions. The authors evaluate their corpus-based approach in a classification task and contrast their wordlist with emotionally-annotated wordlists produced by experimental focus groups. Having located happiness temporally and semantically within this corpus of everyday life, the paper concludes by offering a corpus-inspired livable recipe for happiness.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30980/
Creating a Testbed for the Evaluation of Automatically Generated Back-of-the-book Indexes
Date: February 2006
Creator: Csomai, Andras & Mihalcea, Rada, 1974-
Description: This paper discusses the automatic generation of back-of-the-book indexes. Abstract: The automatic generation of back-of-the-book indexes seems to be out of sight of the Information Retrieval and Natural Language Processing communities, although the increasingly large number of books available in electronic format, as well as recent advances in keyphrase extraction, should motivate an increased interest in this topic. In this paper, the authors describe the background relevant to the process of creating back-of-the-book indexes, namely (1) a short overview of the origin and structure of back-of-the-book indexes, and (2) the correspondence that can be established between techniques for automatic index construction and keyphrase extraction. Since the development of any automatic system requires in the first place an evaluation testbed, the authors describe their work in building a gold standard collection of books and indexes, and present several metrics that can be used for the evaluation of automatically generated indexes against the gold standard. Finally, the authors investigate the properties of the gold standard index, such as index size, length of index entries, and upper bounds on coverage as indicated by the presence of index entries in the document.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30982/
Creating Large Annotated Data Collections with Web Users' Help
Date: April 2003
Creator: Mihalcea, Rada, 1974- & Chklovski, Timothy A. (Timothy Anatolievich), 1977-
Description: This paper discusses creating annotated data collections. Abstract: Open Mind Word Expert is an implemented active learning system that aims to create large annotated corpora by tapping into the world's vast pool of knowledge. It does this by relying on the vast number of Web users who contribute their knowledge to data annotation. Open Mind Word Expert focuses on building semantically annotated corpora, by collecting word sense tagging from the general public over the Web. During the first nine months of activity, the system yielded 90,000 high quality tagged items at a much lower cost than the traditional method of hiring lexicographers.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30949/
Cross-lingual Semantic Relatedness Using Encyclopedic Knowledge
Date: August 2009
Creator: Hassan, Samer & Mihalcea, Rada, 1974-
Description: This paper discusses cross-lingual semantic relatedness using encyclopedic knowledge. Abstract: In this paper, we address the task of cross-lingual semantic relatedness. We introduce a method that relies on the information extracted from Wikipedia, by exploiting the interlanguage links available between Wikipedia versions in multiple languages. Through experiments performed on several language pairs, we show that the method performs well, with a performance comparable to monolingual measures of relatedness.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc31012/
Current Research in Wireless at UNT
Date: October 2004
Creator: Akl, Robert G.
Description: This presentation discusses wireless networks, access point selections, traffic balancing, multi-cell CDMA, user distribution modeling, and call admission control.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30930/
The Decomposition of Human-Written Book Summaries
Date: March 2009
Creator: Ceylan, Hakan & Mihalcea, Rada, 1974-
Description: In this paper, the authors evaluate the extent to which human-written book summaries can be obtained through cut-and-paste operations from the original book. The authors analyze the effect of the parameters involved in the decomposition algorithm, and highlight the distinctions in coverage obtained for different summary types.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc31018/
Document Indexing using Named Entities
Date: January 2001
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This article discusses document indexing using named entities. Abstract: Current text indexing and retrieval techniques have their roots in the field of Information Retrieval where the task is to extract documents that best match a query. With an ever increasing number of documents available due to the easy access through the Internet, the challenge is to provide users with concise and relevant information. The authors are proposing here a novel, yet simple approach, which indexes the named entities in the documents, so as to improve the relevance of documents retrieved. Experiments performed in finding information related to a set of 75 input questions, from a large collection of 125,000 documents, show that this new technique reduces the number of retrieved documents by a factor of 2, while still retrieving the relevant documents.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc83311/
Dynamic Agent Population in Agent-Based Distance Vector Routing
Date: August 2002
Creator: Amin, Kaizar A. & Mikler, Armin R.
Description: This paper discusses dynamic agent population in agent-based distance vector routing. Abstract: The intelligent mobile agent paradigm can be applied to a wide variety of intrinsically parallel and distributed applications. Network routing is one such application that can be mapped to an agent-based approach. The performance of any agent-based system will depend on its agent population. Although a lot of research has been conducted on agent-based systems, little consideration has been given to the importance of agent population in dynamic networks. A large number of constituent agents can increase the resource overhead of the system, thereby impeding the overall performance of the network. Hence, it is imperative to find the optimal number of agents in the system that would maximize the efficiency of the agent-based mechanism in the network. This optimal value cannot be determined manually, thereby emphasizing the need for an adaptive approach that manipulates the number of agents in the system based on its resource availability. This paper discusses an agent-based approach to Distance Vector Routing, referred to as Agent-based Distance Vector Routing, and also describes an adaptive approach controlling the number of agents in the network using pheromones and discusses their limitations.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc132968/
Dynamic Channel Assignment in IEEE 802.11 Networks
Date: March 2007
Creator: Akl, Robert G. & Arepally, Anurag
Description: This paper discusses dynamic channel assignment in IEEE 802.11 networks. Abstract: We design a dynamic channel assignment algorithm for IEEE 802.11 wireless networks. Our algorithm assigns channels dynamically in a way that minimizes channel interference generated by neighboring access points (APs) on a reference access point, resulting in higher throughput. We implement and simulate our algorithm using two versions (1: pick randomly; 2: pick first) and different numbers of APs (4, 9, 16, and 25). Analysis of this algorithm shows an improvement by a factor of 4 (by lowering the total interference on an AP by 6 dBm on average) over default settings of having all APs use the same channel. As the number of APs is increased in a given service area, dynamic channel assignment becomes crucial; otherwise overlapping channel interference becomes a limiting factor.
Contributing Partner: UNT College of Engineering
Permalink: digital.library.unt.edu/ark:/67531/metadc30837/