Extrapolating Subjectivity Research to Other Languages Metadata
Title
- Main Title: Extrapolating Subjectivity Research to Other Languages
Creator
- Author: Banea, Carmen; Creator Type: Personal
Contributor
- Chair: Mihalcea, Rada, 1974-; Contributor Type: Personal; Contributor Info: Major Professor
- Committee Member: Wiebe, Janyce; Contributor Type: Personal
- Committee Member: Tarau, Paul; Contributor Type: Personal
- Committee Member: Chen, Jiangping; Contributor Type: Personal
Publisher
- Name: University of North Texas; Place of Publication: Denton, Texas; Additional Info: www.unt.edu
Date
- Creation: 2013-05
Language
- English
Description
- Content Description: Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language is a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer; in the field of natural language, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, and basic resources such as electronic text, annotated corpora, or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that make lighter use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, in which English often acts as the donor language (the fulcrum in a lever) and makes it possible to establish preliminary subjectivity research in a target language with a relatively minimal amount of effort.
Subject
- Keyword: Natural language processing
- Keyword: subjectivity analysis
- Keyword: multilingual subjectivity
Collection
- Name: UNT Theses and Dissertations; Code: UNTETD
Institution
- Name: UNT Libraries; Code: UNT
Rights
- Rights Access: public
- Rights Holder: Banea, Carmen
- Rights License: copyright
- Rights Statement: Copyright is held by the author, unless otherwise noted. All rights reserved.
Resource Type
- Thesis or Dissertation
Format
- Text
Identifier
- Archival Resource Key: ark:/67531/metadc271777
Degree
- Academic Department: Department of Computer Science and Engineering
- Degree Discipline: Computer Science and Engineering
- Degree Level: Doctoral
- Degree Name: Doctor of Philosophy
- Degree Grantor: University of North Texas
- Degree Publication Type: disse