Search Results

[Dataset of Web Archiving Research Articles]

Description: Datasets used in the presentation, "Towards Building a Collection of Web Archiving Research Articles." The files included here were used to conduct several Machine Learning classification experiments that result in a corpus of scholarly research articles on the topic of web archiving.
Date: August 2014
Creator: Reyes Ayala, Brenda & Caragea, Cornelia
Partner: UNT College of Information

#DescribeTrumpWithOneWord Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to the hashtag #DescribeTrumpWithOneWord. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 15,676 Tweets make up the combined dataset.
Date: 2017-09-02/2017-09-22
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Development of Facultative Air Breathing in Bristlenose Plecos (Ancistrus cirrhosus)

Description: Data collected on air breathing development in the bristlenose pleco. Bristlenose plecos breathe air with a highly vascularized stomach when exposed to aquatic hypoxic conditions. This study looked at the development of this behavior and when the fish first began to breathe air.
Date: January 2024
Creator: Crowder, Lauren W. & Dzialowski, Edward M. (Edward Michael)
Partner: UNT College of Science

#DiaperDon Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to the hashtag #DiaperDon. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 866,987 Tweets make up the combined dataset.
Date: 2020-11-18/2020-12-01
Creator: Phillips, Mark Edward
Partner: UNT Libraries

ERCOT/2021 Texas Power Crisis Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to the Electric Reliability Council of Texas (ERCOT) during the 2021 Texas power crisis, from February 10 through February 27, 2021. The dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 612,082 Tweets make up the combined dataset.
Date: 2021-02-09/2021-02-24
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Ethics Gaming Survey Results

Description: Dataset generated for a National Science Foundation grant project, "EAGER: Prototyping a Virtue Ethics Game." These files contain the research results of the pre-test and post-test surveys.
Date: August 29, 2013
Creator: Oppong, Joseph R.
Partner: UNT College of Arts and Sciences

Extended Date/Time Format (EDTF) Dates Research Datasets

Description: Two datasets, each with 390,751 date samples from the UNT Libraries' digital collections. These samples were compiled for research regarding the Extended Date/Time Format (EDTF) standard. The first dataset contains a concatenated list of date values from the metadata records in The Portal to Texas History, the UNT Digital Library, and The Gateway to Oklahoma History. The "classified" dataset includes labels expressing whether each date is EDTF-valid and the level of conformance.
Date: February 28, 2013
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Gaming Census Dataset

Description: This dataset represents survey feedback gathered about games in libraries, collections, cataloging, outreach, and programming.
Date: December 3, 2018
Creator: Brannon, Sian; Robson, Diane & Dewitt-Miller, Erin
Partner: UNT Libraries

Hurricane Dorian Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Hurricane Dorian, which is the most intense tropical cyclone on record to strike the Bahamas and is regarded as the worst natural disaster in the country's history. This dataset was created using the twarc (https://github.com/DocNow/twarc) package that makes use of Twitter's search API. A total of 3,000,553 Tweets and 84,216 media files make up the combined dataset.
Date: 2019-08-25/2019-09-14
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Hurricane Florence Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Hurricane Florence and the subsequent flooding along the Carolina coastal region. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 4,971,575 Tweets and 347,205 media files make up the combined dataset.
Date: 2018-09-05/2018-10-03
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Hurricane Harvey Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Hurricane Harvey and the subsequent flooding along the Texas Gulf region. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 7,041,866 Tweets make up the combined dataset.
Date: 2017-08-18/2017-09-22
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Hurricane Ida Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Hurricane Ida, which was a deadly and destructive Category 4 Atlantic hurricane that made landfall in Louisiana in 2021. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 1,868,703 Tweets make up the combined dataset.
Date: 2021-08-20/2021-09-22
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Hurricane Laura Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to Hurricane Laura, which formed August 20, 2020, and dissipated August 29, 2020. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 1,168,178 Tweets make up the combined dataset.
Date: 2020-08-18/2020-09-02
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Hydroxychloroquine Twitter Dataset

Description: This dataset contains Twitter JSON data for several Twitter search queries that were collected related to the drug hydroxychloroquine and its purported effectiveness as a coronavirus treatment. This dataset was created to capture the opinions on Twitter after a group of people calling themselves "America’s Frontline Doctors" released a video sharing misleading claims about the virus and the drug's use as an effective treatment. This dataset was created using the twarc (https://github.com/DocNow/… more
Date: 2020-07-20/2020-08-11
Creator: Phillips, Mark Edward
Partner: UNT Libraries

John Lewis Twitter Dataset

Description: This dataset contains Twitter JSON data for several Twitter search queries that were collected following the death on July 17, 2020, of American politician and civil-rights leader John Lewis, who served in the United States House of Representatives for Georgia's 5th congressional district from 1987 until his death. This dataset was created using the twarc (https://github.com/DocNow/twarc) package that makes use of Twitter's search API. A total of 6,870,881 Tweets and 42,055 media files make up … more
Date: 2020-07-10/2020-08-10
Creator: Phillips, Mark Edward
Partner: UNT Libraries

#Kaepernick7 and #ISupportKaepernickBecause Twitter Dataset

Description: This dataset contains Twitter JSON data for Tweets related to the hashtags #Kaepernick7 and #ISupportKaepernickBecause. This dataset was created using the twarc (https://github.com/edsu/twarc) package that makes use of Twitter's search API. A total of 573,379 Tweets make up the combined dataset.
Date: 2016-08-20/2016-08-31
Creator: Phillips, Mark Edward
Partner: UNT Libraries

Labeled PDF Dataset from End of Term (EOT) 2008 Web Archive

Description: This dataset contains a random sample of 2000 PDF documents from the usda.gov domain in the End of Term (EOT) 2008 Web Archive. These samples were categorized as being of interest for possible inclusion in the Technical Report Archive and Image Library (TRAIL). Each PDF has been sorted into two categories, Technical_Report and Not_Technical_Report.
Date: July 2018
Creator: Kirkwood, Patricia; Phillips, Mark Edward & Caldwell, Christopher
Partner: UNT Libraries

Labeled PDF Dataset from UNT.edu

Description: This dataset contains a random sample of 2000 PDF documents from the Spring 2017 Web Archive of the unt.edu domain (https://digital.library.unt.edu/ark:/67531/metadc993363/) that have been sorted into two categories, ForRepo and NotForRepo.
Date: November 15, 2017
Creator: Andrews, Pamela & Phillips, Mark Edward
Partner: UNT Libraries