Award ER25844: Minimizing System Noise Effects for Extreme-Scale Scientific Simulation Through Function Delegation Page: 4 of 6
LogGOPSim (previously LogGPSim) is a full LogGPS simulator. LogGPS is an established network
model for parallel applications and algorithms. The simulator is capable of reading MPI traces from
real MPI applications and replaying them with different LogGP parameters and injected system noise.
The gathered noise traces (Netgauge output) can be used as input for the simulator to achieve accurate
simulation results. The current version of the simulator supports full LogGPS. In addition, we have
extended LogGOPSim with the principles of ORCS, so it can now simulate congestion on real
networks and routings (together with the system noise). We have incorporated an efficient virtual
memory paging approach to enable LogGOPSim to simulate much larger systems. LogGOPSim is
publicly available and can read Netgauge noise traces and real application traces to simulate the
influence of OS noise on parallel applications. Further details about LogGOPSim and the full source
code are available at https://www.unixer.de/research/LogGOPSim/
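As a rough illustration of the quantities such a simulator works with, the sketch below computes the end-to-end time of a single message under the LogGP cost formula and stretches a CPU interval by the noise detours that begin inside it. It is not taken from LogGOPSim: the names LogGPParams, message_time and apply_noise, the example parameter values, and the (begin, length) detour format are all assumptions made for illustration. The S threshold of LogGPS (the message size above which a synchronizing protocol is used) is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGPParams:
    """Hypothetical container for LogGP parameters (all times in microseconds)."""
    L: float  # network latency
    o: float  # CPU overhead per message (sender and receiver)
    g: float  # gap between consecutive messages (unused by the single-message formula)
    G: float  # gap per byte for long messages


def message_time(p: LogGPParams, size_bytes: int) -> float:
    """End-to-end time of one message of size_bytes: o + (k-1)*G + L + o."""
    return p.o + (size_bytes - 1) * p.G + p.L + p.o


def apply_noise(start: float, duration: float, detours) -> float:
    """Finish time of CPU work that starts at `start` and needs `duration`.

    Assumption: each detour is a (begin, length) pair, and a detour that
    begins while the work is in progress suspends the work for `length`.
    Detours that begin before `start` are ignored in this simplified model.
    """
    finish = start + duration
    for begin, length in sorted(detours):
        if start <= begin < finish:
            finish += length  # work is pushed back by the detour
    return finish


if __name__ == "__main__":
    params = LogGPParams(L=5.0, o=1.0, g=2.0, G=0.01)  # made-up values
    print(message_time(params, 1))
    print(apply_noise(0.0, 10.0, [(3.0, 2.0), (11.0, 1.0), (20.0, 5.0)]))
```

In this model a detour can push the work into the window of a later detour, which is why the detours are processed in time order.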
Our proposal to add non-blocking collectives to the MPI-3 standard has been accepted and is now
part of the draft MPI-3 standard. If our proposal survives the upcoming final vote, it will be
included in MPI-3.
(The MPI implementation MPICH already includes non-blocking collectives, and the implementation
was informed by the LibNBC implementation.)
1. Group Operation Assembly Language (GOAL)
We designed the Group Operation Assembly Language as a framework to define collective
communications in . We show the universality of this language and how it can be used to implement
all existing collective operations. By design, it readily lends itself to blocking and nonblocking
execution, as well as to off-loaded execution of complex group communication operations. We also
define several offline and online optimizations (compiler transformations and scheduling decisions,
respectively) to improve the overall performance of the operation. Performance results show that the
overhead to express current collective operations is negligible in comparison to the potential gains in a
highly optimized implementation.
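To make the idea of expressing a collective as a set of dependent point-to-point operations more concrete, the sketch below builds a per-rank schedule for a binomial-tree broadcast rooted at rank 0. It does not use GOAL syntax: the Op record, its field names, and binomial_bcast_schedule are hypothetical names chosen for this illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """One operation in a rank's schedule (hypothetical representation)."""
    kind: str   # "send" or "recv"
    peer: int   # rank of the communication partner
    size: int   # message size in bytes
    deps: list = field(default_factory=list)  # indices of ops that must finish first


def binomial_bcast_schedule(rank: int, nprocs: int, size: int) -> list:
    """Schedule of one rank for a binomial-tree broadcast rooted at rank 0."""
    ops = []
    mask = 1
    # A non-root rank receives from the parent given by its lowest set bit.
    while mask < nprocs:
        if rank & mask:
            ops.append(Op("recv", rank - mask, size))
            break
        mask <<= 1
    # Every send depends on the receive (op index 0), if there is one.
    deps = [0] if ops else []
    mask >>= 1
    while mask > 0:
        if rank + mask < nprocs:
            ops.append(Op("send", rank + mask, size, list(deps)))
        mask >>= 1
    return ops


if __name__ == "__main__":
    for r in range(8):
        print(r, [(op.kind, op.peer, op.deps) for op in binomial_bcast_schedule(r, 8, 4)])
```

Because the schedule records only operations and their dependencies, the same description could in principle be executed in a blocking or a nonblocking fashion, which is the property the paragraph above highlights.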
2. LogGPS Analysis of Different Collective Communication Algorithms
To test LogGPSim, we conducted a series of experiments to simulate different collective
operations with different real networking (LogGPS) parameters. We use the gathered parameters to
simulate LogGPS models of collective operations and demonstrate the errors in common benchmarking
methods for collective operations. The simulations provide new insight into the nature of collective
algorithms and their pipelining properties. We show that the error grows linearly with the system size.
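A minimal sketch of how collective algorithms can be compared under such a model is given below. It estimates the completion time of a broadcast over a binomial tree and over a flat tree using plain LogGP costs. It is not the simulator described above and does not reproduce the reported results: the function names and parameter values are made up, the S parameter of LogGPS is ignored, and the spacing between consecutive sends of one rank, (k-1)*G + max(o, g), is an assumption of this sketch.

```python
def binomial_children(rank, nprocs):
    """Children of `rank` in a binomial tree rooted at rank 0."""
    mask = 1
    while mask < nprocs and not (rank & mask):
        mask <<= 1
    mask >>= 1
    children = []
    while mask > 0:
        if rank + mask < nprocs:
            children.append(rank + mask)
        mask >>= 1
    return children


def flat_children(rank, nprocs):
    """Children of `rank` in a flat tree: the root sends to everyone."""
    return list(range(1, nprocs)) if rank == 0 else []


def broadcast_time(nprocs, size, L, o, g, G, children_of):
    """Time at which the last rank has received the message."""
    wire = (size - 1) * G             # per-byte cost of one message
    ready = [None] * nprocs           # time each rank holds the data
    ready[0] = 0.0
    # Parents have lower ranks than their children in both trees,
    # so visiting ranks in ascending order sees each parent first.
    for rank in range(nprocs):
        for i, child in enumerate(children_of(rank, nprocs)):
            start = ready[rank] + i * (wire + max(o, g))  # assumed send spacing
            ready[child] = start + o + wire + L + o
    return max(ready)


if __name__ == "__main__":
    params = dict(size=1, L=5.0, o=1.0, g=2.0, G=0.0)  # made-up values
    for p in (8, 64, 512):
        print(p,
              broadcast_time(p, children_of=binomial_children, **params),
              broadcast_time(p, children_of=flat_children, **params))
```

Which tree finishes first depends on the ratio of L and o to g and on the number of processes, so the parameters should be swept rather than read off a single run.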
3. Noise Simulation Study
We used experimentally measured noise characteristics for six common HPC architectures, and
performed large-scale LogGPS simulations to characterize the performance of collective operations and
Lumsdaine, Andrew. Award ER25844: Minimizing System Noise Effects for Extreme-Scale Scientific Simulation Through Function Delegation, report, November 20, 2012; United States. (digital.library.unt.edu/ark:/67531/metadc840896/m1/4/: accessed January 21, 2019), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.