Co-Training for Topic Classification of Scholarly Data

Description

This paper describes a co-training approach that uses the text and citation information of a research article as two different views to predict the topic of an article.

Physical Description

10 p.

Creation Information

Caragea, Cornelia; Bulgarov, Florin & Mihalcea, Rada. September 2015.

Context

This paper is part of the collection entitled UNT Scholarly Works and was provided by the UNT College of Engineering to the UNT Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 51 times, with 9 views in the last month. More information about this paper can be viewed below.

Who

People and organizations associated with either the creation of this paper or its content.

Authors

  • Caragea, Cornelia
  • Bulgarov, Florin
  • Mihalcea, Rada

Provided By

UNT College of Engineering

The UNT College of Engineering strives to educate and train engineers and technologists who have the vision to recognize and solve the problems of society. The college comprises six degree-granting departments of instruction and research.

What

Descriptive information to help identify this paper. Follow the links below to find similar items on the Digital Library.

Notes

Abstract: With the exponential growth of scholarly data during the past few years, effective methods for topic classification are greatly needed. Current approaches usually require large amounts of expensive labeled data in order to make accurate predictions. In this paper, we posit that, in addition to a research article’s textual content, its citation network also contains valuable information. We describe a co-training approach that uses the text and citation information of a research article as two different views to predict the topic of an article. We show that this method improves significantly over the individual classifiers, while also bringing a substantial reduction in the amount of labeled data required for training accurate classifiers.
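The two-view idea in the abstract can be illustrated with a minimal sketch of a generic co-training loop. This is not the authors' implementation: the `NaiveBayes` class, the `co_train` and `predict` functions, the `per_round` parameter, and the toy articles and citation IDs are all hypothetical, chosen only to show how a text-view classifier and a citation-view classifier could each label unlabeled articles for a shared training set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes over token lists (Laplace smoothing)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = defaultdict(Counter)
        for tokens, y in zip(docs, labels):
            self.counts[y].update(tokens)
        self.vocab = {t for c in self.counts.values() for t in c}
        return self

    def predict_proba(self, tokens):
        n = sum(self.prior.values())
        v = len(self.vocab) + 1  # one extra bucket for unseen tokens
        logp = {}
        for y in self.classes:
            total = sum(self.counts[y].values())
            lp = math.log(self.prior[y] / n)
            for t in tokens:
                lp += math.log((self.counts[y][t] + 1) / (total + v))
            logp[y] = lp
        m = max(logp.values())
        z = sum(math.exp(lp - m) for lp in logp.values())
        return {y: math.exp(lp - m) / z for y, lp in logp.items()}


def fit_views(labeled):
    """Train one classifier per view: index 0 = text, index 1 = citations."""
    ys = [y for _, y in labeled]
    return [NaiveBayes().fit([x[v] for x, _ in labeled], ys) for v in (0, 1)]


def co_train(labeled, unlabeled, rounds=5, per_round=2):
    """Grow the labeled set with each view's most confident predictions."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picked = {}
        for v, clf in enumerate(fit_views(labeled)):
            scored = []
            for i, x in enumerate(pool):
                probs = clf.predict_proba(x[v])
                label = max(probs, key=probs.get)
                scored.append((probs[label], i, label))
            # Keep the per_round most confident pool items for this view.
            for _conf, i, label in sorted(scored, reverse=True)[:per_round]:
                picked.setdefault(i, label)
        labeled += [(pool[i], lab) for i, lab in picked.items()]
        pool = [x for i, x in enumerate(pool) if i not in picked]
    return fit_views(labeled), labeled


def predict(models, x):
    """Combine the two views by multiplying their class probabilities."""
    p_text, p_cite = models[0].predict_proba(x[0]), models[1].predict_proba(x[1])
    scores = {y: p_text[y] * p_cite[y] for y in models[0].classes}
    return max(scores, key=scores.get)


# Hypothetical toy data: each article is (text tokens, cited paper IDs).
labeled = [
    ((["neural", "network", "training"], ["p1", "p2"]), "ML"),
    ((["protein", "gene", "sequence"], ["p7", "p8"]), "Bio"),
]
unlabeled = [
    (["neural", "gradient"], ["p2", "p3"]),
    (["gene", "expression"], ["p8", "p9"]),
    (["gradient", "descent"], ["p3"]),
    (["expression", "genome"], ["p9"]),
]
models, grown = co_train(labeled, unlabeled)
print(len(grown), predict(models, (["descent"], ["p3"])))
```

In this toy run, the token "descent" and the cited ID "p3" never appear in the two hand-labeled articles, so neither initial classifier can separate the topics for that query; they only become informative after the loop has pseudo-labeled the unlabeled articles. A real system would use stronger classifiers, confidence thresholds, and a held-out set to decide when to stop.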

Source

  • 2015 Conference on Empirical Methods in Natural Language Processing, September 17-21, 2015. Lisbon, Portugal.

Publication Information

  • Publication Title: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
  • Pages: 2357-2366
  • Peer Reviewed: Yes

Collections

This paper is part of the following collection of related materials.

UNT Scholarly Works

Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.

When

Dates and time periods associated with this paper.

Creation Date

  • September 2015

Added to The UNT Digital Library

  • Aug. 31, 2017, 5:38 p.m.

Caragea, Cornelia; Bulgarov, Florin & Mihalcea, Rada. Co-Training for Topic Classification of Scholarly Data, paper, September 2015; Stroudsburg, Pennsylvania. (digital.library.unt.edu/ark:/67531/metadc991478/: accessed December 16, 2018), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT College of Engineering.