Classifying Scientific Publications Using Abstract Features

Description

Article discussing classifying scientific publications using abstract features.

Physical Description

8 p.: ill.

Creation Information

Caragea, Cornelia; Silvescu, Adrian; Kataria, Saurabh; Caragea, Doina & Mitra, Prasenjit. 2011.

Context

This paper is part of the collection entitled UNT Scholarly Works and was provided by the UNT College of Engineering to the Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 83 times. More information about this paper can be viewed below.

Who

People and organizations associated with either the creation of this paper or its content.

Provided By

UNT College of Engineering

The UNT College of Engineering promotes intellectual and scholarly pursuits in the areas of computer science and engineering, preparing innovative leaders in a variety of disciplines. The UNT College of Engineering encourages faculty and students to pursue interdisciplinary research among numerous subjects of study including databases, numerical analysis, game programming, and computer systems architecture.

What

Descriptive information to help identify this paper.

Notes

Copyright © 2011, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract: With the exponential increase in the number of documents available online, e.g., news articles, weblogs, scientific documents, effective and efficient classification methods are required in order to deliver the appropriate information to specific users or groups. The performance of document classifiers critically depends, among other things, on the choice of the feature representation. The commonly used "bag of words" representation can result in a large number of features. Feature abstraction helps reduce a classifier input size by learning an abstraction hierarchy over the set of words. A cut through the hierarchy specifies a compressed model, where the nodes on the cut represent abstract features. In this paper, we compare feature abstraction with two other methods for dimensionality reduction, i.e., feature selection and Latent Dirichlet Allocation (LDA). Experimental results on two data sets of scientific publications show that classifiers trained using abstract features significantly outperform those trained using features that have the highest average mutual information with the class, and those trained using the topic distribution and topic words output by LDA. Furthermore, we propose an approach to automatic identification of a cut in order to trade off the complexity of classifiers against their performance. Our results demonstrate the feasibility of the proposed approach.
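The idea of a cut through an abstraction hierarchy can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vocabulary, the hand-written hierarchy `PARENT`, and the helper names `abstract_feature` and `compress` are all hypothetical, and the paper learns its hierarchy from data rather than specifying it by hand. The sketch only shows how a chosen cut maps a bag of words onto a smaller set of abstract features.

```python
from collections import Counter

# Hypothetical, hand-written hierarchy (child -> parent) over a toy vocabulary.
# In the paper the hierarchy is learned; here it is assumed for illustration.
PARENT = {
    "neural": "learning_terms",
    "bayesian": "learning_terms",
    "genome": "biology_terms",
    "protein": "biology_terms",
    "learning_terms": "root",
    "biology_terms": "root",
}


def abstract_feature(word, cut):
    """Walk up from a word until a node on the cut is found.

    Returns None when neither the word nor any ancestor lies on the cut.
    """
    node = word
    while node is not None:
        if node in cut:
            return node
        node = PARENT.get(node)
    return None


def compress(tokens, cut):
    """Replace bag-of-words counts with counts over the nodes on the cut."""
    counts = Counter()
    for token in tokens:
        feature = abstract_feature(token, cut)
        if feature is not None:
            counts[feature] += 1
    return counts


doc = "neural bayesian neural protein".split()

# A coarse cut: two abstract features stand in for four words.
coarse = compress(doc, {"learning_terms", "biology_terms"})

# A finer cut: two words kept as-is, the biology words merged.
fine = compress(doc, {"neural", "bayesian", "biology_terms"})
```

Moving the cut toward the root yields fewer features and a more compact classifier input. Moving it toward the leaves recovers more of the original bag-of-words detail. This is the complexity-versus-performance trade-off the abstract refers to.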

Source

  • Proceedings of the Symposium on Abstraction, Reformulation, and Approximation, 2011, Parador de Cardona, Spain

Publication Information

  • Peer Reviewed: Yes

Collections

This paper is part of the following collection of related materials.

UNT Scholarly Works

Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.

When

Dates and time periods associated with this paper.

Creation Date

  • 2011

Added to The UNT Digital Library

  • Sept. 13, 2013, 2:58 p.m.

Description Last Updated

  • March 27, 2014, 11:37 a.m.

Citations, Rights, Re-Use

Caragea, Cornelia; Silvescu, Adrian; Kataria, Saurabh; Caragea, Doina & Mitra, Prasenjit. Classifying Scientific Publications Using Abstract Features, paper, 2011; [Palo Alto, California]. (digital.library.unt.edu/ark:/67531/metadc181686/: accessed August 23, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT College of Engineering.