Publisher Info:
Lawrence Berkeley National Lab., CA (United States)
Place of Publication:
California
Provided By
UNT Libraries Government Documents Department
Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.
Description
The authors describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating precision event logs that can be used to provide detailed end-to-end application and system level monitoring; a Java agent-based system for managing the large amount of logging data; and tools for visualizing the log data and real-time state of the distributed system. The authors developed these tools for analyzing a high-performance distributed system centered around the transfer of large amounts of data at high speeds from a distributed storage server to a remote visualization client. However, this methodology should be generally applicable to any distributed system. This methodology, called NetLogger, has proven invaluable for diagnosing problems in networks and in distributed systems code. This approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system.
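The core idea in the description, timestamped event logs collected at several points along a data path and then correlated end to end, can be illustrated with a minimal Python sketch. This is not NetLogger's actual API or log format: the field names (`ts`, `host`, `prog`, `event`, `id`), the event names, and all function names below are hypothetical, chosen only for illustration. It also assumes that clocks on the participating hosts are synchronized closely enough for cross-host timestamps to be comparable.

```python
import time
from collections import defaultdict


def make_event(host, program, event, request_id, clock=time.time):
    """Build one timestamped event record (hypothetical field names)."""
    return {
        "ts": clock(),        # wall-clock timestamp in seconds
        "host": host,         # where the event was generated
        "prog": program,      # which component generated it
        "event": event,       # name of the monitored point
        "id": request_id,     # ties together events for one request
    }


def format_event(record):
    """Serialize a record as a single line of key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in record.items())


def stage_latencies(records):
    """Group events by request id, order them by timestamp, and
    return the elapsed seconds between each pair of consecutive events."""
    by_request = defaultdict(list)
    for record in records:
        by_request[record["id"]].append(record)

    result = {}
    for request_id, events in by_request.items():
        events.sort(key=lambda r: r["ts"])
        result[request_id] = [
            (a["event"], b["event"], b["ts"] - a["ts"])
            for a, b in zip(events, events[1:])
        ]
    return result
```

Merging records from the storage server, the network path, and the client into one list and passing it to `stage_latencies` shows where time is spent for each request, which is the kind of combined host, network, and application view the description refers to.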
This report is part of the following collection of related materials.
Office of Scientific & Technical Information Technical Reports
Reports, articles and other documents harvested from the Office of Scientific and Technical Information.
Office of Scientific and Technical Information (OSTI) is the Department of Energy (DOE) office that collects, preserves, and disseminates DOE-sponsored research and development (R&D) results that are the outcomes of R&D projects or other funded activities at DOE labs and facilities nationwide and grantees at universities and other institutions.
Tierney, Brian; Johnston, William; Crowley, Brian; Hoo, Gary; Brooks, Chris & Gunter, Dan. The NetLogger Methodology for High Performance Distributed Systems Performance Analysis, report, December 23, 1999; California. (digital.library.unt.edu/ark:/67531/metadc718330/: accessed April 25, 2018), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.