Scalable Systems Software Enabling Technology Center
This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
NCSA's role in the SciDAC Scalable Systems Software (SSS) project was to develop
interfaces and communication mechanisms for systems monitoring, and to implement a
prototype demonstrating those standards. The Scalable Systems Monitoring component of the
SSS suite was designed to provide a large volume of both static and dynamic systems data to
the components within the SSS infrastructure as well as external data consumers.
Scalability is central to the design of this component. The number of devices in large high-performance
computing installations has increased dramatically over the last few years.
Concurrently, the availability of high-quality data at the device level has expanded significantly.
Network switches, interconnect switches, power controllers, host adapters, storage systems and many
other devices expose not only performance information but also data that is useful in predictive
failure analysis, such as temperature, voltage, transient memory/CPU errors and fan speed.
In traditional systems, there are multiple separate and often overlapping infrastructures to gather
and interpret this information without a common interface to provide the data to its consumers. Resource
managers, performance monitors, and administration/health monitors often use independent mechanisms
for collecting their own information without any aggregation or sharing. This project addressed a
common interface specification for the collection and distribution of device data in an extensible and
This software component was developed in three phases. The first phase involved
designing a monitoring system prototype that collects the data needed by the other Scalable Systems
Software components it must interact with, and defining an extensible XML interface. In addition, this
interface was designed and tested to provide a framework that accommodates new data
types and devices to be monitored. Existing software for the collection and visualization of system
performance data was adapted to integrate with this new communication mechanism, and new
applications were tested to demonstrate the flexibility of the component interface. The data was viewed
graphically using existing tools developed locally. In addition to the collection of system performance
data, an application was developed to view the registration and communications between the
software systems contained within the SciDAC Scalable Systems Software project. This application was
helpful in the debugging phase of component interactions, and also served as a visual aid in demonstrating
the communication paths within the entire scalable systems software stack.
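
The idea of an extensible XML interface for device data can be illustrated with a minimal sketch. The report does not reproduce the actual SSS monitoring schema, so the element and attribute names below (monitor-data, device, metric, name, unit, host) and both function names are hypothetical and chosen only for illustration. The point is that a consumer which reads metrics generically by name does not need to change when a new data type or device is added.

```python
import xml.etree.ElementTree as ET


def build_report(host, readings):
    """Serialize readings as XML.

    readings: iterable of (device, metric, value, unit) tuples.
    All element and attribute names are hypothetical, not the SSS schema.
    """
    root = ET.Element("monitor-data", {"host": host})
    devices = {}
    for device, metric, value, unit in readings:
        dev = devices.get(device)
        if dev is None:
            # Create one <device> element per distinct device name.
            dev = ET.SubElement(root, "device", {"name": device})
            devices[device] = dev
        elem = ET.SubElement(dev, "metric", {"name": metric, "unit": unit})
        elem.text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_report(xml_text):
    """Return {(device, metric): (value, unit)} for every metric found.

    Metrics are read generically, so previously unseen metric names
    or devices are accepted without any change to this function.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for dev in root.iter("device"):
        for elem in dev.iter("metric"):
            key = (dev.get("name"), elem.get("name"))
            result[key] = (float(elem.text), elem.get("unit"))
    return result


if __name__ == "__main__":
    report = build_report(
        "node001",
        [("fan0", "speed", 4200, "rpm"), ("cpu0", "temperature", 61.5, "C")],
    )
    print(report)
    print(parse_report(report))
```

In this sketch, adding a voltage reading for a power controller only requires another tuple on the producer side; the consumer code is unchanged, which is the kind of extensibility the interface described above was meant to provide.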
Showerman, Michael T. Scalable Systems Software Enabling Technology Center, report, April 6, 2009; United States. (https://digital.library.unt.edu/ark:/67531/metadc930563/m1/2/: accessed April 18, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.