This presentation discusses improving access to web archives through innovative analysis of PDF content. It covers the background of the End of Term (EOT) 2008 Presidential Web Archive, a collaborative web archiving project; collection development with web archive content; and the workflow and processes involved in these projects.
Improving Access to Web Archives through Innovative Analysis of PDF Content, ark:/67531/metadc155622
Collections
This presentation is part of the following collection of related materials.
UNT Scholarly Works
Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.
This paper discusses improving access to web archives through innovative analysis of PDF content. It outlines the overall workflow and describes the tools used to extract document features. The findings suggest opportunities to develop retrieval tools that offer new ways of selecting content and building collections from large web archives.
Relationship to this item: (Is Version Of)
Improving Access to Web Archives through Innovative Analysis of PDF Content, ark:/67531/metadc155622