Development of a dual serial-parallel multiphase CFD code for application to industrial combustor/reactor systems.
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
workstation that does not have an MPI library available. These tie-off functions are collected in a small module that
can be simply included in the compilation and executable module build process on a single processor machine.
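The tie-off idea can be illustrated with a minimal sketch, written here in Python rather than in the language of the code described in this report. It assumes a stand-in communicator that presents a one-process view, so that solver code written against the same interface runs without a message-passing library. The class `SerialComm`, its method names, and the toy `solve_step` function are hypothetical and are not taken from the authors' module.

```python
class SerialComm:
    """Hypothetical single-process stand-in for a message-passing library."""

    def rank(self):
        return 0  # the only process plays the role of processor zero

    def size(self):
        return 1

    def bcast(self, data, root=0):
        # With one process, the root already holds the data.
        return data

    def gather(self, data, root=0):
        # Gathering from a single process yields a one-element list.
        return [data]

    def barrier(self):
        pass  # nothing to synchronize with


def solve_step(comm, coeffs):
    """Toy solver step that only talks to the communicator interface."""
    local = comm.bcast(coeffs, root=0)
    partial = sum(local)
    return sum(comm.gather(partial, root=0))


result = solve_step(SerialComm(), [1.0, 2.0, 3.0])
```

Because `solve_step` never refers to a specific library, a parallel communicator with the same method names could be substituted on a multiprocessor machine without touching the solver code.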
An incremental approach to conversion to a fully parallel code using MPI was chosen to allow for continuing
development during the conversion process. The first step in adding parallel processing functionality was to make
the code fully functional on an arbitrary number of processors, including one, with a parallel core linear equation
solver. About 70 percent of computation time is spent in the core solver, so a significant speedup was expected
with this parallel functionality on machines with 2 to about 16 processors. Problem size, however, is still limited at
this stage because the entire grid must still fit on processor zero. The grid is partitioned in the axial flow (X-)
direction, and data is passed out to each of P processors by processor 0 at the start of the inner iteration. The inner
iteration employs an alternating direction implicit (ADI) scheme within each processor for the Z- and Y-directions.
This is embedded in a multi-colored Gauss-Seidel sweep, which involves neighbor-to-neighbor data exchanges in the
X-direction. Future versions will incorporate decomposition in each space dimension.
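As a rough illustration of the expected speedup and of the axial partitioning, the sketch below applies Amdahl's law to the 70 percent figure and splits a set of X-planes into contiguous slabs. It assumes that the core solver parallelizes perfectly and that communication is free, which the text does not claim, and the function names `amdahl_speedup` and `partition_x` are hypothetical. Under those assumptions the bound is roughly 1.5 on 2 processors and 2.9 on 16, with a ceiling of about 3.3 however many processors are used.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the run time is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def partition_x(nx, processors):
    """Split nx axial planes into contiguous [start, stop) slabs."""
    base, extra = divmod(nx, processors)
    slabs = []
    start = 0
    for p in range(processors):
        # The first `extra` processors take one additional plane each.
        stop = start + base + (1 if p < extra else 0)
        slabs.append((start, stop))
        start = stop
    return slabs


bound_2 = amdahl_speedup(0.7, 2)
bound_16 = amdahl_speedup(0.7, 16)
slabs = partition_x(10, 4)
```

The even split is only one possible choice; a production code might weight the slabs by local work instead.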
The overall structure of the code with the parallelized core solver is shown in Figure 2. Processor zero handles
all input/output functions, the overall iteration control and convergence checking functions, and the computation of
coefficients and source terms for each of the linearized PDEs that the low level solver is called to solve. The low
level solver starts up on all processors. The partitions of the coefficient matrix and constant vector are packed in a
buffer and broadcast from processor zero to the other processors via a binary fan-out. Each processor solves its
partition of the problem iteratively, swapping ghost cell information after each sweep over its portion of the
decomposed domain. The solution vector, which is an improved estimate of the solution of one coupled PDE in the
set of PDEs to be solved, is then communicated back to processor zero in a gather data procedure via a binary fan-in.
The outer control loop on processor zero uses the updated solution of each PDE to compute updated
coefficients and constant terms of the linearized discrete equations of the other PDEs in the system to be solved. It
also checks for global convergence of all PDEs, and if the convergence criteria are met or a specified number of
global iterations has been exceeded, it outputs results for post processing (plotting, etc.), convergence information,
and a restart file that allows the global solution algorithm to resume exactly where it left off.
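The report does not give the exact message schedule, so the following is only a sketch of one common way to organize a binary fan-out and the matching fan-in. It assumes that the set of processors holding the buffer doubles in each round, starting from processor zero, and it only lists who would send to whom rather than moving any data. The function names `fan_out_schedule` and `fan_in_schedule` are hypothetical.

```python
def fan_out_schedule(processors):
    """Return rounds of (sender, receiver) pairs for a doubling fan-out."""
    rounds = []
    have_data = 1  # ranks 0 .. have_data - 1 currently hold the buffer
    while have_data < processors:
        pairs = [(src, src + have_data)
                 for src in range(have_data)
                 if src + have_data < processors]
        rounds.append(pairs)
        have_data *= 2
    return rounds


def fan_in_schedule(processors):
    """Reverse the fan-out: same pairs, opposite direction and order."""
    return [[(dst, src) for src, dst in pairs]
            for pairs in reversed(fan_out_schedule(processors))]


out_rounds = fan_out_schedule(8)
in_rounds = fan_in_schedule(4)
```

With this pattern the number of rounds grows with the base-2 logarithm of the processor count instead of linearly, which is the usual motivation for a tree-shaped distribution and collection.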
[Flowchart not recoverable from the extracted text. Legible labels: Initialize MPI; Process 0?; compute coefficients, etc.; fan out data; swapping ghost cells.]
Figure 2: Simplified Flowchart Showing Implementation of the Parallel Solver
PARALLEL CODE VALIDATION
Validation of the parallel code is accomplished by using a restart capability in combination with a highly
converged problem solution of the serial version of the code. At each point in the parallelization process where the
parallel version can be run from the restart file, however inefficiently, the parallel version is restarted from the serial
version restart file. If the partially converted version iterates at the highly converged solution without diverging from
it, then it is considered to duplicate the numerical algorithms of the serial version with a high level of confidence. As
a final check at major milestones, the parallel version is started from the initial guess for the same problem and
convergence to the same solution within a very small tolerance is verified. The tolerance for 64-bit floating point
calculations is 10^-12.
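The two checks described above can be sketched on a toy problem. The example below is an illustration only and is not the authors' code: it assumes a one-dimensional Laplace equation with fixed end values, a two-color (red-black) Gauss-Seidel sweep, and four slabs that keep private copies of their points plus one ghost cell on each side. All function and variable names are hypothetical.

```python
TOL = 1e-12  # tolerance quoted in the text for 64-bit arithmetic


def serial_sweep(u):
    """One red-black Gauss-Seidel sweep for u'' = 0 with fixed end values."""
    for color in (1, 0):
        for i in range(1, len(u) - 1):
            if i % 2 == color:
                u[i] = 0.5 * (u[i - 1] + u[i + 1])


def split(u, slabs):
    """Give each slab its owned points plus one ghost cell on each side."""
    return [[u[lo - 1]] + u[lo:hi] + [u[hi]] for lo, hi in slabs]


def swap_ghosts(parts):
    """Exchange edge values between neighboring slabs."""
    for left, right in zip(parts, parts[1:]):
        left[-1] = right[1]
        right[0] = left[-2]


def partitioned_sweep(parts, slabs):
    """Same sweep as serial_sweep, with a ghost swap after each color."""
    for color in (1, 0):
        for part, (lo, _) in zip(parts, slabs):
            for j in range(1, len(part) - 1):
                if (lo + j - 1) % 2 == color:  # lo + j - 1 is the global index
                    part[j] = 0.5 * (part[j - 1] + part[j + 1])
        swap_ghosts(parts)


def gather(parts, u):
    """Reassemble the global vector, keeping the boundary values of u."""
    owned = [x for part in parts for x in part[1:-1]]
    return [u[0]] + owned + [u[-1]]


def max_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


n = 18
slabs = [(1, 5), (5, 9), (9, 13), (13, 17)]
initial = [0.0] * (n - 1) + [1.0]

# Highly converged reference solution from the serial sweep.
serial = list(initial)
for _ in range(3000):
    serial_sweep(serial)

# Check 1: restart the partitioned solver from the serial solution.
parts = split(serial, slabs)
for _ in range(10):
    partitioned_sweep(parts, slabs)
restart_drift = max_diff(gather(parts, serial), serial)

# Check 2: run the partitioned solver from the initial guess.
parts = split(initial, slabs)
for _ in range(3000):
    partitioned_sweep(parts, slabs)
scratch_diff = max_diff(gather(parts, initial), serial)
```

Comparing `restart_drift` and `scratch_diff` against `TOL` mirrors the restart check and the milestone check, respectively. The restart check is cheap because it needs only a few sweeps, which is why it suits intermediate stages of a conversion.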
Lottes, S. A.; Fischer, P. F. & Chang, S. L. Development of a dual serial-parallel multiphase CFD code for application to industrial combustor/reactor systems., article, May 16, 2000; Illinois. (https://digital.library.unt.edu/ark:/67531/metadc712529/m1/7/: accessed May 22, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.