Search Results

Labelled PDF Dataset from UNT.edu

Description: This dataset contains a random sample of 2000 PDF documents from the Spring 2017 Web Archive of the unt.edu domain. (https://digital.library.unt.edu/ark:/67531/metadc993363/) that have been sorted into two categories, ForRepo and NotForRepo.
Date: November 15, 2017
Creator: Andrews, Pamela & Phillips, Mark Edward

Labeled PDF Dataset from Texas Records and Information Locator (TRAIL) Web Archive

Description: This dataset contains a random sample of 2000 PDF documents from the usda.gov domain in the End of Term (EOT) 2008 Web Archive. These samples were categorized as being of interest for possible inclusion in the Technical Report Archive and Image Library (TRAIL). Each PDF has been sorted into two categories, Technical_Report and Not_Technical_Report.
Date: July 2018
Creator: Kirkwood, Patricia; Phillips, Mark Edward & Caldwell, Christopher