Enhancing Scalability of Sparse Direct Methods
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
Name       Code     Type     Order (N)  nnz(A)/N  Fill-ratio
matrix181  M3D-C1   Real       589,698       161         9.3
matrix211  M3D-C1   Real       801,378       161         9.3
cclinear2  NIMROD   Complex    259,203       109         7.5
dds15      Omega3P  Real       834,575        16        40.2
Table 1. Characteristics of the sample matrices. The sparsity is measured as the average number
of nonzeros per row (i.e., nnz(A)/N), and the Fill-ratio is the ratio of the number of nonzeros
in L+U to that in A. Here, MeTiS is used to reorder the equations to reduce fill.
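The two measures in Table 1 can be illustrated on a toy pattern. The following is a minimal sketch, not SuperLU's algorithm: it assumes elimination in the given order with no pivoting and no numerical cancellation, and the helper names (`arrow_pattern`, `symbolic_lu_pattern`, `sparsity_and_fill`) are hypothetical. It also shows why the ordering matters for fill.

```python
def arrow_pattern(n, hub):
    """Pattern with a full diagonal plus one dense row and column at index `hub`."""
    pattern = {(i, i) for i in range(n)}
    pattern |= {(hub, j) for j in range(n)}
    pattern |= {(i, hub) for i in range(n)}
    return pattern


def symbolic_lu_pattern(n, pattern):
    """Nonzero pattern of L+U for elimination in natural order.

    Assumption: the diagonal is structurally nonzero and no pivoting occurs,
    so only the structure of A determines the result.
    """
    rows = [set() for _ in range(n)]
    for i, j in pattern:
        rows[i].add(j)
    for k in range(n):
        # Columns to the right of the pivot in row k.
        upper_k = {j for j in rows[k] if j > k}
        for i in range(k + 1, n):
            if k in rows[i]:
                # Eliminating entry (i, k) adds row k's pattern to row i.
                rows[i] |= upper_k
    return {(i, j) for i in range(n) for j in rows[i]}


def sparsity_and_fill(n, pattern):
    """Return (nnz(A)/N, nnz(L+U)/nnz(A)) for a structural pattern."""
    lu = symbolic_lu_pattern(n, pattern)
    return len(pattern) / n, len(lu) / len(pattern)


if __name__ == "__main__":
    for hub in (0, 4):
        per_row, fill = sparsity_and_fill(5, arrow_pattern(5, hub))
        print(f"hub={hub}: nnz(A)/N={per_row:.1f}, fill-ratio={fill:.2f}")
```

With the dense row and column ordered first, the factors fill in completely (25 nonzeros from 13). Ordering them last produces no fill at all, which is the effect a fill-reducing ordering aims for.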
[Figure 1 plots: runtime versus number of IBM Power 5 processors. (a) Factorization time. (b) Triangular solution time.]
Figure 1. SuperLU runtime (seconds) for the linear systems from the SciDAC applications.
The runs were performed on the IBM Power 5 machine at NERSC. The factorization reached a rate
of 161 Gflop/s for matrix211.
dimensions can be tens to hundreds of millions. The main method used is shift-invert Lanczos,
for which the shifted linear systems are solved with a combination of direct and iterative methods.
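For reference, the spectral transformation behind shift-invert can be written as follows. This is a sketch that assumes a standard eigenproblem A x = lambda x and a shift sigma that is not an eigenvalue; the symbols are illustrative, and the application codes may pose a generalized problem instead.

```latex
(A - \sigma I)^{-1} x = \nu x,
\qquad \nu = \frac{1}{\lambda - \sigma}
\quad\Longleftrightarrow\quad
\lambda = \sigma + \frac{1}{\nu}.
```

Each Lanczos step applies the shifted inverse to a vector, that is, it solves one linear system with the shifted matrix. Eigenvalues closest to the shift become the largest in magnitude after the transformation, so they converge first.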
1.3. SuperLU efficiency with these applications
SuperLU is a leading scalable direct solver for sparse linear systems; its development
is mainly funded through the TOPS SciDAC project (led by David Keyes).
Table 1 shows the characteristics of a few typical matrices taken from these simulation codes.
Figure 1 shows the parallel runtime of the two important phases of SuperLU: factorization and
triangular solution. The experiments were performed on an IBM Power 5 parallel machine at
NERSC. In the strong-scaling sense, the factorization routine scales very well, although performance
varies across applications. The triangular solution takes a very small fraction of the total time, but
it does not scale as well as factorization, mainly because of its larger communication-to-computation
ratio and higher degree of sequential dependencies. One of our future tasks is to
improve the scalability of this phase, since in these application codes the triangular solution often
needs to be performed several times for each factorization.
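The pattern of one factorization followed by several triangular solutions can be sketched with a small dense example. This is an illustration only, not SuperLU's implementation: it assumes a matrix for which LU without pivoting is stable (here, a diagonally dominant one), and the function names `lu_factor` and `lu_solve` are hypothetical. The loop comments mark the sequential dependencies that limit parallelism in the solve phase.

```python
def lu_factor(a):
    """Doolittle LU without pivoting; returns (L, U) as dense lists of lists."""
    n = len(a)
    lower = [[float(i == j) for j in range(n)] for i in range(n)]
    upper = [[float(v) for v in row] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lower[i][k] = upper[i][k] / upper[k][k]
            for j in range(k, n):
                upper[i][j] -= lower[i][k] * upper[k][j]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve L U x = b by forward and then back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # Forward substitution: y[i] depends on every earlier y[j].
        y[i] = b[i] - sum(lower[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        # Back substitution: x[i] depends on every later x[j].
        tail = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / upper[i][i]
    return x


if __name__ == "__main__":
    a = [[4, 1, 0], [1, 4, 1], [0, 1, 4]]
    lower, upper = lu_factor(a)  # done once
    for rhs in ([5, 6, 5], [4, 1, 0]):  # reused for every right-hand side
        print(rhs, "->", lu_solve(lower, upper, rhs))
```

The factorization does most of the arithmetic but is paid for once. Each additional right-hand side costs only the two substitution sweeps, whose chained dependencies leave less independent work to distribute across processors.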
In the last year or so, we have been focusing on developing new algorithms to enhance
scalability of our direct solvers. The new results are summarized in the next two sections.
2. Improving memory scalability of SuperLU by parallelizing symbolic factorization
Symbolic factorization is the phase that determines the nonzero locations of the L and U factors. In most
parallel sparse direct solvers, this phase is performed in serial, with matrix A being available on
Li, Xiaoye S.; Demmel, James; Grigori, Laura; Gu, Ming; Xia, Jianlin; Jardin, Steve et al. Enhancing Scalability of Sparse Direct Methods, article, July 23, 2007; Berkeley, California. (digital.library.unt.edu/ark:/67531/metadc896087/m1/2/: accessed January 21, 2019), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.