Classifying Scientific Publications Using Abstract Features

Caragea, Cornelia; Silvescu, Adrian; Kataria, Saurabh; Caragea, Doina; Mitra, Prasenjit

Description

Article discussing classifying scientific publications using abstract features.

Physical Description

8 p.: ill.

Creation Information

Caragea, Cornelia; Silvescu, Adrian; Kataria, Saurabh; Caragea, Doina & Mitra, Prasenjit 2011.

Context

This paper is part of the collection entitled: UNT Scholarly Works and was provided by the UNT College of Engineering to the UNT Digital Library, a digital repository hosted by the UNT Libraries.

Authors

Caragea, Cornelia University of North Texas
Silvescu, Adrian Naviance, Inc.
Kataria, Saurabh Pennsylvania State University
Caragea, Doina Kansas State University
Mitra, Prasenjit Pennsylvania State University

Publisher

American Association for Artificial Intelligence
Place of Publication: [Palo Alto, California]

Provided By

UNT College of Engineering

The UNT College of Engineering strives to educate and train engineers and technologists who have the vision to recognize and solve the problems of society. The college comprises six degree-granting departments of instruction and research.

Degree Information

Department: Computer Science and Engineering

Notes

Abstract: With the exponential increase in the number of documents available online, e.g., news articles, weblogs, scientific documents, effective and efficient classification methods are required in order to deliver the appropriate information to specific users or groups. The performance of document classifiers critically depends, among other things, on the choice of the feature representation. The commonly used "bag of words" representation can result in a large number of features. Feature abstraction helps reduce a classifier input size by learning an abstraction hierarchy over the set of words. A cut through the hierarchy specifies a compressed model, where the nodes on the cut represent abstract features. In this paper, we compare feature abstraction with two other methods for dimensionality reduction, i.e., feature selection and Latent Dirichlet Allocation (LDA). Experimental results on two data sets of scientific publications show that classifiers trained using abstract features significantly outperform those trained using features that have the highest average mutual information with the class, and those trained using the topic distribution and topic words output by LDA. Furthermore, we propose an approach to automatic identification of a cut in order to trade off the complexity of classifiers against their performance. Our results demonstrate the feasibility of the proposed approach.
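The feature selection baseline named in the abstract ranks words by their mutual information with the class. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes binary word presence per document and a plain top-k cutoff, and the function names (`mutual_information`, `select_top_k`) and the toy documents are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Return {word: MI(word presence; class)} in bits.

    docs   -- list of token lists (assumed already tokenized)
    labels -- list of class labels, one per document
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)
    class_counts = Counter(labels)
    vocab = set().union(*doc_sets)
    scores = {}
    for w in vocab:
        # Joint counts over (word present?, class label).
        joint = Counter((w in d, y) for d, y in zip(doc_sets, labels))
        present = sum(1 for d in doc_sets if w in d)
        mi = 0.0
        for (has_w, y), count in joint.items():
            p_xy = count / n
            p_x = (present if has_w else n - present) / n
            p_y = class_counts[y] / n
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
        scores[w] = mi
    return scores


def select_top_k(docs, labels, k):
    """Keep the k words with the highest MI; ties break alphabetically."""
    scores = mutual_information(docs, labels)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Toy data (hypothetical): two "ml" abstracts and two "db" abstracts.
docs = [
    ["neural", "network", "learning"],
    ["neural", "learning", "kernel"],
    ["database", "query", "learning"],
    ["database", "index", "query"],
]
labels = ["ml", "ml", "db", "db"]
top_words = select_top_k(docs, labels, 3)
```

On this toy data the words that perfectly separate the two classes ("database", "neural", "query") score 1 bit each and are kept, while "learning", which appears in both classes, scores lower and is dropped. Feature abstraction, as the abstract describes it, would instead group words into abstract features rather than discard them.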

Source

Proceedings of the Symposium on Abstraction, Reformulation, and Approximation, 2011, Parador de Cardona, Spain

Language

English

Item Type

Paper

Identifier

Unique identifying numbers for this paper in the Digital Library or other systems.

Archival Resource Key: ark:/67531/metadc181686

Publication Information

Peer Reviewed: Yes

Collections

This paper is part of the following collection of related materials.

UNT Scholarly Works

Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.

Creation Date

2011

Added to The UNT Digital Library

Sept. 13, 2013, 2:58 p.m.

Description Last Updated

Oct. 31, 2023, 2:49 p.m.

Caragea, Cornelia; Silvescu, Adrian; Kataria, Saurabh; Caragea, Doina & Mitra, Prasenjit. Classifying Scientific Publications Using Abstract Features, paper, 2011; [Palo Alto, California]. (https://digital.library.unt.edu/ark:/67531/metadc181686/: accessed April 19, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.