Multivariate geographic clustering on the world's first zero price/performance parallel computer
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
divisions, based on some measure of similarity, into all possible numbers of groups, from one single
group which contains all objects to as many groups as there are objects, with each object being in a
group by itself. Hierarchical clustering, which results in a complete similarity tree, is
computer-intensive, and the assemblage to be classified is limited to relatively few objects.
Non-hierarchical clustering provides only a single, user-specified level of division into groups; however,
it can be used to classify a much larger number of objects.
Multivariate geographic clustering employs non-hierarchical clustering on the individual pixels in a
digital map from a Geographic Information System for the purpose of classifying the cells into types or
categories. The classification of satellite imagery into land cover or vegetation classes using spectral
characteristics of each cell from multiple images taken at different wavelengths is a common example of
multivariate geographic clustering. Rarely, however, is non-hierarchical clustering performed on map
cell characteristics aside from spectral reflectance values.
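As a rough illustration of non-hierarchical clustering applied to map cells, the sketch below groups cells into a user-specified number of categories with a k-means style loop. This section does not name the exact algorithm used, so the k-means choice, the function names, and the evenly spaced seeding are assumptions made for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length value tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_cells(cells, k, max_iter=100):
    """Assign each map cell to one of k groups (k-means style sketch).

    cells: list of tuples, one tuple of characteristic values per map cell.
    k:     the single, user-specified number of groups.
    """
    n = len(cells)
    # Hypothetical seeding: take k cells at evenly spaced input positions.
    centroids = [list(cells[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each cell joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(cell, centroids[j]))
            for cell in cells
        ]
        if new_labels == labels:
            break  # no cell changed group, so the grouping is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its member cells.
        for j in range(k):
            members = [c for c, lab in zip(cells, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Six toy cells with two characteristics each, forming two obvious groups.
cells = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = cluster_cells(cells, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each pass touches every cell once per group, so the cost grows linearly with the number of cells. In practice the characteristics would probably need to be rescaled to comparable units before distances are computed; that step is omitted here.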
Maps showing the suitability or characterization of regions are used for many purposes, including
identifying appropriate ranges for particular plant and animal species, identifying suitable crops for an
area or identifying a suitable area for a given crop, and identifying Plant Hardiness Zones for gardeners.
In addition, ecologists have long used the concept of the ecoregion, an area within which there are
similar ecological conditions, as a tool for understanding large geographic areas (Bailey, 1983, 1994,
1995, 1996; Omernik, 1986). Such regionalization maps, however, are usually prepared by individual
experts in a rather subjective way, and are essentially objectifications of expert opinion.
Our goal was to make repeatable the process of map regionalization based not on spectral cell
characteristics, but on characteristics identified as important to the growth of woody vegetation. By
using non-hierarchical multivariate geographic clustering we intended to produce several maps of
ecoregions across the entire nation at a resolution of one kilometer per cell. At this resolution, the 48
conterminous United States contain over 7.7 million map cells. Nine characteristics from three
categories--elevation, edaphic (or soil) factors, and climatic factors--were identified as important. The
edaphic factors are 1) plant-available water capacity, 2) soil organic matter, 3) total Kjeldahl soil
nitrogen, and 4) depth to seasonally-high water table. The climatic factors are 1) mean precipitation
during the growing season, 2) mean solar insolation during the growing season, 3) degree-day heat sum
during the growing season, and 4) degree-day cold sum during the non-growing season. The growing
season is defined as the frost-free period between the mean dates of the last and first frost each year. A map for
each of these characteristics was generated from best-available data at a 1 km resolution for input into
the clustering process (Hargrove and Luxmoore, 1998). Given the size of this input data and the
significant amount of computer time typically required to perform statistical clustering, we decided a
parallel computer was needed for this task.
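To give a sense of scale, the following back-of-the-envelope sketch estimates the size of the problem from the figures above (7.7 million cells, nine characteristics). The 4-byte storage size and the number of groups (1000) are hypothetical values chosen only for illustration; neither appears in this section. The pairwise count shows what a naive hierarchical approach, comparing every cell with every other cell, would face.

```python
def clustering_workload(cells, variables, clusters, bytes_per_value=4):
    """Rough size estimates for clustering a gridded map.

    bytes_per_value and clusters are assumptions, not values from the report.
    """
    values = cells * variables
    return {
        # Total numbers that must be read: one per cell per characteristic.
        "input_values": values,
        # Storage for the input if each value takes bytes_per_value bytes.
        "input_megabytes": values * bytes_per_value / 1e6,
        # Cell-to-centroid distances evaluated in one non-hierarchical pass.
        "distances_per_pass": cells * clusters,
        # Cell-to-cell similarities a naive hierarchical method would need.
        "pairwise_similarities": cells * (cells - 1) // 2,
    }


estimate = clustering_workload(cells=7_700_000, variables=9, clusters=1000)
for name, amount in estimate.items():
    print(f"{name}: {amount:,}")
```

Under these assumptions a single pass needs about 7.7 billion distance evaluations, while the pairwise count is roughly 3 x 10^13, which illustrates why a non-hierarchical method and a parallel machine were attractive for this task.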
The Stone SouperComputer
Because of the geographic clustering application and other computational research opportunities, a
proposal for internal funding was developed which would support the construction of a Beowulf-style
cluster of new PCs, the first such system at Oak Ridge National Laboratory, and the development of the
necessary software algorithms. With the proposal rejected and significant effort already expended, we
chose to build a cluster anyway using the resources that were readily available: surplus Intel 486 PCs
destined for salvage.
Commandeering a nearly-abandoned computer room and scavenging as many surplus machines as
Hoffman, F. M.; Hargrove, W. W. & Schultz, A. J. Multivariate geographic clustering on the world's first zero price/performance parallel computer, article, October 1998; Tennessee. (https://digital.library.unt.edu/ark:/67531/metadc710846/m1/4/: accessed April 21, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.