Date: December 2002
Creator: Mihalcea, Rada, 1974-
Description: This paper presents a novel approach to word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet), and from a generated corpus (GenCor), and (2) instance-based learning with automatic feature selection, applied when training data is available for a particular word. The ideas described in this paper were implemented in a system that achieved the best score in the SENSEVAL-2 evaluation exercise on both the English all-words and English lexical sample tasks.
Contributing Partner: UNT College of Engineering