Mapping Texts: Combining Text-Mining and Geo-Visualization To Unlock The Research Potential of Historical Newspapers

Description:

White paper on the Mapping Texts project, which combines text-mining and geo-visualization to unlock the research potential of historical newspapers.

Creator(s):
Creation Date: 2011
Partner(s):
UNT College of Arts and Sciences
Collection(s):
UNT Scholarly Works
Creator (Author):
Torget, Andrew J., 1978-

University of North Texas

Creator (Author):
Mihalcea, Rada, 1974-

University of North Texas

Creator (Author):
Christensen, Jon

Stanford University

Creator (Author):
McGhee, Geoff

Stanford University


Degree:
Department: History
Note:

Abstract: In September 2010, the University of North Texas (in partnership with Stanford University) was awarded a National Endowment for the Humanities Level II Digital Humanities Start-Up Grant (Award #HD-51188-10) to develop a series of experimental models for combining the possibilities of text-mining with geospatial mapping in order to unlock the research potential of large-scale collections of historical newspapers. Using a sample of approximately 230,000 pages of historical newspapers from the 'Chronicling America' digital newspaper database, we developed two interactive visualizations of the language content of these massive collections of historical documents as they spread across both time and space: one measuring the quantity and quality of the digitized content, and a second measuring several of the most widely used large-scale language pattern metrics common in natural language processing work. This white paper documents those experiments and their outcomes, as well as our recommendations for future work.

Physical Description:

53 p.

Subject(s):
Keyword(s): text-mining | geo-visualization | newspapers | historical documents
Source: National Endowment for the Humanities Level II Digital Humanities Start-Up Grant
Series Title: Mapping Texts
Identifier:
  • GRANTNO: HD-51188-10
  • ARK: ark:/67531/metadc83797
Resource Type: Paper
Format: Text
Rights:
Access: Public