Leveraging Machine Learning to Extract Content-Rich Publications from Web Archives

PDF Version Also Available for Download.

Description

Poster presented at the 2019 Texas Conference on Digital Libraries (TCDL-2019). This poster discusses ways of identifying content-rich documents among the wealth of materials available via web archives. The research attempts to answer the following two research questions: 1. What role do web-published documents and publications play in developing collections in the broad categories of institutional repositories, state government documents, and publications from the federal government? 2. What are the characteristics of web-published documents and publications that help content selectors identify them for inclusion in their local collection?

Physical Description

1 p. : ill.

Creation Information

Fox, Nathaniel T. & Phillips, Mark Edward. May 22, 2019.

Context

This presentation is part of the collection entitled UNT Scholarly Works and was provided by UNT Libraries to the UNT Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 19 times, with 19 in the last month. More information about this presentation can be viewed below.

Who

People and organizations associated with either the creation of this presentation or its content.

Provided By

UNT Libraries

The UNT Libraries serve the university and community by providing access to physical and online collections, fostering information literacy, supporting academic research, and much, much more.

What

Descriptive information to help identify this presentation. Follow the links below to find similar items on the Digital Library.

Source

  • Texas Conference on Digital Libraries (TCDL-2019), May 20-23, Austin, Texas.

Language

Item Type

Collections

This presentation is part of the following collection of related materials.

UNT Scholarly Works

Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.

When

Dates and time periods associated with this presentation.

Creation Date

  • May 22, 2019

Added to The UNT Digital Library

  • Aug. 27, 2019, 2:15 p.m.

Usage Statistics

When was this presentation last used?

Yesterday: 0
Past 30 days: 19
Total Uses: 19

Fox, Nathaniel T. & Phillips, Mark Edward. Leveraging Machine Learning to Extract Content-Rich Publications from Web Archives, presentation, May 22, 2019; (https://digital.library.unt.edu/ark:/67531/metadc1533639/: accessed September 17, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu.