Mapping Texts: Combining Text-Mining and Geo-Visualization To Unlock The Research Potential of Historical Newspapers
MAPPING TEXTS: COMBINING TEXT-MINING AND GEO-VISUALIZATION TO
UNLOCK THE RESEARCH POTENTIAL OF HISTORICAL NEWSPAPERS
A White Paper for the National Endowment for the Humanities
Andrew J. Torget, University of North Texas
Rada Mihalcea, University of North Texas
Jon Christensen, Stanford University
Geoff McGhee, Stanford University
In September 2010, the University of North Texas (in partnership with Stanford University) was
awarded a National Endowment for the Humanities Level II Digital Humanities Start-Up Grant
(Award #HD-51188-10) to develop a series of experimental models for combining the possibilities
of text-mining with geospatial mapping in order to unlock the research potential of large-scale
collections of historical newspapers. Using a sample of approximately 230,000 pages of historical
newspapers from the Chronicling America digital newspaper database, we developed two
interactive visualizations of the language content of these massive collections of historical
documents as they spread across both time and space: one measuring the quantity and quality of
the digitized content, and a second measuring several of the most widely used large-scale
language pattern metrics common in natural language processing work. This white paper
documents those experiments and their outcomes, as well as our recommendations for future
Torget, Andrew J., 1978-; Mihalcea, Rada, 1974-; Christensen, Jon & McGhee, Geoff. Mapping Texts: Combining Text-Mining and Geo-Visualization To Unlock The Research Potential of Historical Newspapers, paper, 2011; (digital.library.unt.edu/ark:/67531/metadc83797/m1/1/: accessed August 20, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT College of Arts and Sciences.