End of Term 2008 Presidential Web Archive: PDF Content Analysis

Description:

This presentation discusses the End of Term 2008 Presidential Web Archive. The University of North Texas (UNT) Libraries collaborated with members of the International Internet Preservation Consortium (IIPC) on the End of Term 2008 Presidential Web Harvest from October 2008 to February 2009. The project team archived 160,211,356 URIs during this collaboration, and the resulting archive became the research dataset for an IMLS-funded grant to investigate collection development using web archives. The project team analyzed the archive's 10,318,073 PDFs and developed a retrieval and exploration system for collection developers interested in acquiring and developing born-digital collections from the End of Term Web Archive.

Creator(s): Phillips, Mark Edward
Creation Date: December 5, 2012
Partner(s):
UNT Libraries
Collection(s):
UNT Scholarly Works
Creator (Author):
Phillips, Mark Edward

University of North Texas

Degree:
Department: Libraries
Physical Description:

104 p.

Language(s):
Subject(s):
Keyword(s): archives | harvests | End of Term Archives | Presidential campaigns
Source: Best Practices Exchange Conference, 2012, Annapolis, Maryland, United States
Identifier:
  • ARK: ark:/67531/metadc130188
Resource Type: Presentation
Format: Image
Rights:
Access: Public