Extrapolating Subjectivity Research to Other Languages

Banea, Carmen

Extrapolating Subjectivity Research to Other Languages Metadata

Title

Main Title: Extrapolating Subjectivity Research to Other Languages

Creator

Author: Banea, Carmen

Creator Type: Personal

Contributor

Chair: Mihalcea, Rada, 1974-

Contributor Type: Personal

Contributor Info: Major Professor
Committee Member: Wiebe, Janyce

Contributor Type: Personal
Committee Member: Tarau, Paul

Contributor Type: Personal
Committee Member: Chen, Jiangping

Contributor Type: Personal

Publisher

Name: University of North Texas

Place of Publication: Denton, Texas

Additional Info: www.unt.edu

Date

Creation: 2013-05

Language

English

Description

Content Description: Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language is a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer. In the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, and basic resources such as electronic text, annotated corpora, or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that make lighter use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the proposed techniques employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and allows preliminary subjectivity research to be established in a target language with a relatively minimal amount of effort.

Subject

Keyword: Natural language processing
Keyword: subjectivity analysis
Keyword: multilingual subjectivity

Collection

Name: UNT Theses and Dissertations

Code: UNTETD

Institution

Name: UNT Libraries

Code: UNT

Rights

Rights Access: public
Rights Holder: Banea, Carmen
Rights License: copyright

Resource Type

Thesis or Dissertation

Format

Text

Identifier

Archival Resource Key: ark:/67531/metadc271777

Degree

Academic Department: Department of Computer Science and Engineering
Degree Discipline: Computer Science and Engineering
Degree Level: Doctoral
Degree Name: Doctor of Philosophy
Degree Grantor: University of North Texas
Degree Publication Type: disse