Measuring Semantic Relatedness Using Salient Encyclopedic Concepts

Date: August 2011
Creator: Hassan, Samer
Description: While pragmatics, through its integration of situational awareness and relevant real-world knowledge, offers a high level of analysis suitable for real interpretation of natural dialogue, semantics, on the other hand, represents a lower yet more tractable and affordable level of linguistic analysis using current technologies. Generally, the understanding of semantic meaning in the literature has revolved around the famous quote "You shall know a word by the company it keeps". In this thesis we investigate the role of context constituents in decoding the semantic meaning of the surrounding context; specifically, we probe the role of salient concepts, defined as content-bearing expressions which afford encyclopedic definitions, as a suitable source of semantic clues to an unambiguous interpretation of context. Furthermore, we integrate this world knowledge in building a new and robust unsupervised semantic model and apply it to measure semantic relatedness between textual pairs, whether they are words, sentences or paragraphs. Moreover, we explore the abstraction of semantics across languages and use our findings to build a novel multi-lingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and the superiority of our mono-lingual and multi-lingual models through a comprehensive set of evaluations on specialized ...
Contributing Partner: UNT Libraries
Learning to Identify Educational Materials

Date: 2009
Creator: Hassan, Samer & Mihalcea, Rada, 1974-
Description: This paper discusses learning to identify educational materials.
Contributing Partner: UNT College of Engineering
Cross-lingual Semantic Relatedness Using Encyclopedic Knowledge

Date: August 2009
Creator: Hassan, Samer & Mihalcea, Rada, 1974-
Description: This paper discusses cross-lingual semantic relatedness using encyclopedic knowledge.
Contributing Partner: UNT College of Engineering
Using the Essence of Texts to Improve Document Classification

Date: September 2005
Creator: Mihalcea, Rada, 1974- & Hassan, Samer
Description: This article discusses using the essence of texts to improve document classification.
Contributing Partner: UNT College of Engineering
Random-Walk Term Weighting for Improved Text Classification

Date: September 2007
Creator: Hassan, Samer; Mihalcea, Rada, 1974- & Banea, Carmen
Description: This paper describes a new approach for estimating term weights in a document, and shows how the new weighting scheme can be used to improve the accuracy of a text classifier.
Contributing Partner: UNT College of Engineering
Multilingual Subjectivity Analysis Using Machine Translation

Date: October 2008
Creator: Banea, Carmen; Mihalcea, Rada, 1974-; Wiebe, Janyce M. & Hassan, Samer
Description: This paper discusses multilingual subjectivity analysis using machine translation.
Contributing Partner: UNT College of Engineering
Text Mining for Automatic Image Tagging

Date: August 2010
Creator: Leong, Chee Wee; Mihalcea, Rada, 1974- & Hassan, Samer
Description: This paper introduces several extractive approaches for automatic image tagging, relying exclusively on information mined from texts. Through evaluations on two datasets, the authors show that their methods exceed competitive baselines by a large margin, and compare favorably with the state-of-the-art that uses both textual and image features.
Contributing Partner: UNT College of Engineering
UNT: SubFinder: Combining Knowledge Sources for Automatic Lexical Substitution

Date: June 2007
Creator: Hassan, Samer; Csomai, Andras; Banea, Carmen; Sinha, Ravi & Mihalcea, Rada, 1974-
Description: This paper describes the University of North Texas SubFinder system. The system provides the most likely set of substitutes for a word in a given context by combining several techniques and knowledge sources. SubFinder participated in the best and out of ten (oot) tracks of the SEMEVAL lexical substitution task, consistently ranking first or second.
Contributing Partner: UNT College of Engineering