Language constructs for modular parallel programs Page: 3 of 15
This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
Figure 1: Convolution Algorithm Structure
In a sequential implementation, stepwise refinement may identify FFT, matrix multi-
plication, input, and output modules. The data structures to be input to and output from
each module are specified in its interface; internal data structures and algorithmic details
are encapsulated and not visible to other components. The interface specifies the size and
perhaps the type of the data structures to be operated on, hence allowing the module to
be used in different contexts. For example, the same FFT module can be called three
times in the convolution example. The interface design may also address efficiency issues,
for example by specifying that data is to be passed by reference rather than by value.
The same decomposition can be used in a parallel implementation. Now, however, we
need to encapsulate not only data and computation but also concurrency, communication,
process mapping, and data distribution. If we do not encapsulate this information, two
modules mapped to the same processor might interfere, for example if one received mes-
sages intended for the other. When composing two modules, we prefer not to be aware
of how each maps computation and data. (The mappings used in different modules can
be made conformant to improve performance, but should not affect correctness.) Again,
efficiency issues need to be addressed in the interface design; for example, by allowing for
the exchange of distributed data structures between parallel modules.
In a sequential implementation, modular decomposition may identify the representa-
tion of matrices as common to several program components. Hence, we define a module
that provides the required matrix operations: read an element, write an element, etc.
The rest of the program uses these functions to access matrices; alternative matrix rep-
resentations (e.g., linear storage or a linked-list representation of sparse matrices) can be
substituted without changes to other components.
A design decision of equal significance in a parallel implementation is how computation
and data are mapped to processors. For example, the various phases of the convolution
problem illustrated in Figure 1 can be organized so as to execute (a) on disjoint sets of
processors as a pipeline, (b) in sequence for each input data set, or (c) as some hybrid
of these two approaches (Figure 2). We wish to localize the changes required to explore
these alternatives, which do not affect the basic structure of the algorithm but may have
different performance characteristics. These changes concern the allocation of computa-
tional resources (processors) to modules and the scheduling of components mapped to the
Here’s what’s next.
This report can be searched. Note: Results may vary based on the legibility of text within the document.
Tools / Downloads
Get a copy of this page or view the extracted text.
Citing and Sharing
Reference the current page of this Report.
Foster, I. Language constructs for modular parallel programs, report, March 1, 1996; Illinois. (https://digital.library.unt.edu/ark:/67531/metadc667430/m1/3/: accessed April 20, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.