Date: June 2010
Creator: Akkaya, Cem; Conrad, Alexander; Wiebe, Janyce & Mihalcea, Rada
Description: This paper discusses word sense disambiguation. Abstract: Amazon Mechanical Turk (MTurk) is a marketplace for so-called "human intelligence tasks" (HITs), or tasks that are easy for humans but currently difficult for automated processes. Providers upload tasks to MTurk, and workers then complete them. Natural language annotation is one such human intelligence task. In this paper, the authors investigate using MTurk to collect annotations for Subjectivity Word Sense Disambiguation (SWSD), a coarse-grained word sense disambiguation task. The authors investigate whether they can use MTurk to acquire good annotations with respect to gold-standard data, whether they can filter out low-quality workers (spammers), and whether there is a learning effect associated with repeatedly completing the same kind of task. While their results with respect to spammers are inconclusive, the authors are able to obtain high-quality annotations for the SWSD task. These results suggest a greater role for MTurk with respect to constructing a large-scale SWSD system in the future, promising substantial improvement in subjectivity and sentiment analysis.
Contributing Partner: UNT College of Engineering