Search Results

Network fault tolerance in LA-MPI

Description: LA-MPI is a high-performance, network-fault-tolerant implementation of MPI designed for terascale clusters that are inherently unreliable due to their very large number of system components and to trade-offs between cost and performance. This paper reviews the architectural design of LA-MPI, focusing on our approach to guaranteeing data integrity. We discuss our network data path abstraction that makes LA-MPI highly portable, gives high performance through message striping, and, most importantly, provides the basis for network fault tolerance. Finally, we include some performance numbers for the Quadrics and UDP network paths.
Date: January 1, 2003
Creator: Aulwes, R. T. (Robbie T.); Daniel, D. J. (David J.); Desai, N. N. (Nehal N.); Graham, R. L. (Richard L.); Risinger, L. D. (Larry Dean); Sukalski, M. W. (Mitchel W.) et al.
Partner: UNT Libraries Government Documents Department