Extrapolating Subjectivity Research to Other Languages Metadata

Metadata describes a digital item, providing (if known) such information as creator, publisher, contents, size, relationship to other resources, and more. Metadata may also contain "preservation" components that help us to maintain the integrity of digital files over time.

Title

  • Main Title: Extrapolating Subjectivity Research to Other Languages

Creator

  • Author: Banea, Carmen
    Creator Type: Personal

Contributor

  • Chair: Mihalcea, Rada, 1974-
    Contributor Type: Personal
    Contributor Info: Major Professor
  • Committee Member: Wiebe, Janyce
    Contributor Type: Personal
  • Committee Member: Tarau, Paul
    Contributor Type: Personal
  • Committee Member: Chen, Jiangping
    Contributor Type: Personal

Publisher

  • Name: University of North Texas
    Place of Publication: Denton, Texas
    Additional Info: www.unt.edu

Date

  • Creation: 2013-05

Language

  • English

Description

  • Content Description: Socrates articulated it best: "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language is a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer. In the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are resource-scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, and basic resources such as electronic text, annotated corpora, or lexica. This scarcity severely limits the implementation of techniques on par with those developed for English. By applying methods that rely less on text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the proposed techniques employ a lever approach, in which English often acts as the donor language (the fulcrum in a lever) and makes it possible, with relatively little effort, to establish preliminary subjectivity research in a target language.

Subject

  • Keyword: Natural language processing
  • Keyword: subjectivity analysis
  • Keyword: multilingual subjectivity

Collection

  • Name: UNT Theses and Dissertations
    Code: UNTETD

Institution

  • Name: UNT Libraries
    Code: UNT

Rights

  • Rights Access: public
  • Rights Holder: Banea, Carmen
  • Rights License: copyright
  • Rights Statement: Copyright is held by the author, unless otherwise noted. All rights reserved.

Resource Type

  • Thesis or Dissertation

Format

  • Text

Identifier

  • Archival Resource Key: ark:/67531/metadc271777

Degree

  • Academic Department: Department of Computer Science and Engineering
  • Degree Discipline: Computer Science and Engineering
  • Degree Level: Doctoral
  • Degree Name: Doctor of Philosophy
  • Degree Grantor: University of North Texas
  • Degree Publication Type: disse
