Search Results

Portal to Texas History Newspaper OCR Text Dataset: Denton

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Denton, Texas, from the years 1892 to 1911. Titles in this dataset include: Denton County News, Denton County Record and Chronicle, Denton Evening News, Legal Tender, Record and Chronicle, The Denton County Record, and The Denton Monitor. In all there are 690 issues comprising 4,686 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: El Paso

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from El Paso, Texas, from the years 1881 to 1921. Titles in this dataset include: El Paso Daily Herald, El Paso Daily Times, El Paso Herald, El Paso International Daily Times, El Paso Morning Times, El Paso Sunday Times, El Paso Times, The El Paso Daily Times, and The El Paso Time. In all there are 17,104 issues comprising 177,640 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: Fort Worth

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Fort Worth, Texas, from the years 1883 to 1896. Titles in this dataset include: Fort Worth Daily Gazette, Fort Worth Gazette, and Fort Worth Weekly Gazette. In all there are 4,146 issues comprising 36,199 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: Gainesville

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Gainesville, Texas, from the years 1888 to 1897. Titles in this dataset include: The Daily Hesperian and The Gainesville Daily Hesperian. In all there are 2,286 issues comprising 9,359 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: Galveston

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Galveston, Texas, from the years 1849 to 1897. Titles in this dataset include: Galveston Weekly News and The Galveston Daily News. In all there are 8,136 issues comprising 56,953 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: Houston

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Houston, Texas, from the years 1893 to 1924. Titles in this dataset include: The Houston Daily Post and The Houston Post. In all there are 9,855 issues comprising 184,900 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: McKinney

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from McKinney, Texas, from the years 1880 to 1936. Titles in this dataset include: Collin County Mercury, McKinney Weekly Democrat-Gazette, The Daily Courier, The Daily Gazette, The Democrat, The Democrat-Gazette, The Lion Roar, The McKinney Advocate, The McKinney Examiner, The McKinney Gazette, The Semi-Weekly Courier, The Southern Jerseyite, and The Weekly Democr…
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: San Antonio

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from San Antonio, Texas, from the years 1874 to 1920. Titles in this dataset include: San Antonio Daily Express, San Antonio Daily Light, San Antonio Express, The Daily Express, and The San Antonio Light. In all there are 6,866 issues comprising 130,726 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Portal to Texas History Newspaper OCR Text Dataset: Temple

Description: Dataset of OCR text from The Portal to Texas History and the Texas Digital Newspaper Program. This dataset includes titles from Temple, Texas, from the years 1907 to 1922. Titles in this dataset include: Temple Daily Telegram. In all there are 4,627 issues comprising 44,633 pages of text.
Date: November 12, 2015
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Quality Assurance Practices in Web Archiving [Dataset]

Description: This dataset contains the results of a survey of quality assurance practices within the field of web archiving and its practitioners. To understand current QA practices, the authors surveyed institutions engaged in web archiving, which included national libraries, colleges and universities, and museums and art libraries. The survey was administered online. It includes the completed responses of 54 participants. The data has been anonymized for privacy reasons. This dataset was used in the "Curr…
Date: December 2014
Creator: Reyes Ayala, Brenda; Phillips, Mark Edward & Ko, Lauren
Partner: UNT Libraries

[Response Data: Survey of Benchmarks in Metadata Quality]

Description: Complete, anonymized dataset of responses to the Survey of Benchmarks in Metadata Quality. Date, time, IP addresses, and geographic data have been omitted. Responses that included project, organization, and/or repository names were removed from this data, as were potentially identifying names, acronyms, and/or links.
Date: July 2019
Creator: Digital Library Federation. Assessment Interest Group. Metadata Working Group. Benchmarks Sub-Group.
Partner: UNT Libraries

"Stand With Wendy" Twitter Dataset

Description: This dataset contains Twitter JSON data for several Twitter search queries collected the week following the filibuster by Wendy Davis in the Texas Senate related to Senate Bill 5, using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 560,954 Tweets make up the combined dataset.
Date: 2013-06-25/2013-07-03
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Texas Digital Newspaper Program Issue Dataset for IFLA/Rootstech Analysis

Description: This dataset contains the descriptive metadata harvested from the Texas Digital Newspaper Program collection on The Portal to Texas History and is accompanied by a dataset derived from the harvested metadata. This dataset was used for an IFLA Newspaper Section and Rootstech presentation.
Date: January 16, 2014
Creator: Phillips, Mark Edward & Krahmer, Ana
Partner: UNT Libraries

Tropical Storm Imelda Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Tropical Storm Imelda and the subsequent flooding in the south Texas region. This dataset was created using the twarc (https://github.com/DocNow/twarc) package that makes use of Twitter's search API. A total of 76,420 Tweets and 4,429 media files make up the combined dataset.
Date: 2019-09-10/2019-09-21
Creator: Phillips, Mark Edward
Partner: UNT Libraries