A Framework of Automatic Subject Term Assignment: An Indexing Conception-Based Approach

PDF Version Also Available for Download.

Description

The purpose of dissertation is to examine whether the understandings of subject indexing processes conducted by human indexers have a positive impact on the effectiveness of automatic subject term assignment through text categorization (TC). More specifically, human indexers' subject indexing approaches or conceptions in conjunction with semantic sources were explored in the context of a typical scientific journal article data set. Based on the premise that subject indexing approaches or conceptions with semantic sources are important for automatic subject term assignment through TC, this study proposed an indexing conception-based framework. For the purpose of this study, three hypotheses were tested: ... continued below

Creation Information

Chung, EunKyung December 2006.

Context

This dissertation is part of the collection entitled: UNT Theses and Dissertations and was provided by UNT Libraries to Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 302 times , with 6 in the last month . More information about this dissertation can be viewed below.

Who

People and organizations associated with either the creation of this dissertation or its content.

Chairs

Committee Member

Publisher

Rights Holder

For guidance see Citations, Rights, Re-Use.

  • Chung, EunKyung

Provided By

UNT Libraries

With locations on the Denton campus of the University of North Texas and one in Dallas, UNT Libraries serves the school and the community by providing access to physical and online collections; The Portal to Texas History and UNT Digital Libraries; academic research, and much, much more.

Contact Us

What

Descriptive information to help identify this dissertation. Follow the links below to find similar items on the Digital Library.

Description

The purpose of dissertation is to examine whether the understandings of subject indexing processes conducted by human indexers have a positive impact on the effectiveness of automatic subject term assignment through text categorization (TC). More specifically, human indexers' subject indexing approaches or conceptions in conjunction with semantic sources were explored in the context of a typical scientific journal article data set. Based on the premise that subject indexing approaches or conceptions with semantic sources are important for automatic subject term assignment through TC, this study proposed an indexing conception-based framework. For the purpose of this study, three hypotheses were tested: 1) the effectiveness of semantic sources, 2) the effectiveness of an indexing conception-based framework, and 3) the effectiveness of each of three indexing conception-based approaches (the content-oriented, the document-oriented, and the domain-oriented approaches). The experiments were conducted using a support vector machine implementation in WEKA (Witten, & Frank, 2000). The experiment results pointed out that cited works, source title, and title were as effective as the full text, while keyword was found more effective than the full text. In addition, the findings showed that an indexing conception-based framework was more effective than the full text. Especially, the content-oriented and the document-oriented indexing approaches were found more effective than the full text. Among three indexing conception-based approaches, the content-oriented approach and the document-oriented approach were more effective than the domain-oriented approach. In other words, in the context of a typical scientific journal article data set, the objective contents and authors' intentions were more focused that the possible users' needs. The research findings of this study support that incorporation of human indexers' indexing approaches or conception in conjunction with semantic sources has a positive impact on the effectiveness of automatic subject term assignment.

Language

Identifier

Unique identifying numbers for this dissertation in the Digital Library or other systems.

Collections

This dissertation is part of the following collection of related materials.

UNT Theses and Dissertations

Theses and dissertations represent a wealth of scholarly and artistic content created by masters and doctoral students in the degree-seeking process. Some ETDs in this collection are restricted to use by the UNT community.

What responsibilities do I have when using this dissertation?

When

Dates and time periods associated with this dissertation.

Creation Date

  • December 2006

Added to The UNT Digital Library

  • May 5, 2008, 3:02 p.m.

Description Last Updated

  • June 27, 2017, 3:44 p.m.

Usage Statistics

When was this dissertation last used?

Yesterday: 0
Past 30 days: 6
Total Uses: 302

Interact With This Dissertation

Here are some suggestions for what to do next.

Start Reading

PDF Version Also Available for Download.

Citations, Rights, Re-Use

Chung, EunKyung. A Framework of Automatic Subject Term Assignment: An Indexing Conception-Based Approach, dissertation, December 2006; Denton, Texas. (digital.library.unt.edu/ark:/67531/metadc5473/: accessed December 15, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; .