The ACL Message Passing Library
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
The Message Passing Interface (MPI) library was
designed with efficiency and portability in mind.
The MPI feature set was designed by a committee
that drew on features and concepts from many other
message passing systems [MPI]. What resulted is a
"full-featured" message passing library that includes
many variations on send and receive
(blocking/nonblocking, buffered/unbuffered,
receiver-ready, different data types including user-specified types,
and more). Additionally, MPI includes support for
global operations (barriers, reductions, gather/scatters,
broadcasts, scans, etc.), processor topologies, proces-
sor groups, profiling, and error handling. Process
management (creation, deletion, migration), active
messages, and I/O support are not included in the
Thinking Machines Corporation created CMMD for
the CM-5 massively parallel computer [CMMD].
CMMD supports three styles of communication: syn-
chronous, asynchronous, and active messages (used
for event driven applications). The library also in-
cludes functions for global operations (reductions,
scans, broadcasts, barriers) and I/O. CMMD has no
primitives for process control or virtual machine con-
Many other message passing systems exist that pro-
vide similar functionality to these three. PVM, MPI,
and CMMD are of particular interest to us since they
are the "supported" message passing systems for the
T3D and the CM-5.
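To make the global operations mentioned above concrete, the following is a minimal sketch in plain C of what a sum reduction and an inclusive sum scan compute. It does not call MPI, CMMD, or PVM; the array index stands in for a process number, and the names `reduce_sum` and `scan_sum` are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum reduction: combine one contribution from each of nprocs simulated
 * processes into a single value. contrib[p] is the value held by process p. */
long reduce_sum(const long *contrib, size_t nprocs)
{
    long total = 0;
    for (size_t p = 0; p < nprocs; ++p)
        total += contrib[p];
    return total;
}

/* Inclusive sum scan: out[p] receives contrib[0] + ... + contrib[p],
 * i.e. the running total that process p would see. */
void scan_sum(const long *contrib, long *out, size_t nprocs)
{
    assert(contrib != NULL && out != NULL);
    long running = 0;
    for (size_t p = 0; p < nprocs; ++p) {
        running += contrib[p];
        out[p] = running;
    }
}
```

With contributions 3, 1, 4, 1 from four simulated processes, `reduce_sum` yields 9 and `scan_sum` fills `out` with 3, 4, 8, 9. A real library distributes this work across processors; the sketch shows only the result each operation defines.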
3. The Need for Performance
Our software efforts are targeted towards high perform-
ance software for MPPs and SMPs (Symmetric
Multi-Processors). Our focus is not on harnessing
the latent power of desktop workstations, nor is it on
running a single program on several supercomputers.
Given this, several key differences should be noted
between PVM, MPI, and CMMD.
PVM is widely available for most Unix workstations
and for many common supercomputers and MPPs. It
has many basic communications primitives and
primitives for process management. PVM's main
weakness is that it does not deliver high performance. Earlier
versions use a daemon process on each computer
node which is involved in communications. Recent
versions of PVM allow these daemons to be bypassed;
however, performance is still lacking, as will
MPI is a recent message passing system and is not
widely available. MPI includes numerous primitives
(far more than PVM) but none for process management.
While efficiency is a main goal for MPI, our bench-
marks on the T3D show that it is lacking as well.
Both PVM and MPI also have the goal of supporting
heterogeneous data types and computers.
CMMD differs from PVM and MPI in that it is not
widely available; however, it does have a large user
base since it is the only supported message passing
system available on the CM-5. CMMD has sufficient
primitives without trying to include everything. It
has the basic communications primitives as well as
active messages. It also has the most common
CMMD was designed for interprocessor communica-
tions within the CM-5 and not with processes exter-
nal to the MPP. This allows for several optimiza-
tions. The library does not need to communicate
with heterogeneous processors or data types, which
avoids unnecessary data conversion and the need for a
plethora of different primitives for various data types.
CMMD also takes advantage of the underlying hard-
ware. It makes use of both the data network and the
control network in the CM-5. In particular, the con-
trol network is used in global communications opera-
tions such as reductions and broadcasts.
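The cost that a homogeneous library avoids can be illustrated with a small C sketch. It assumes, purely for illustration, a peer with the opposite byte order; the function names are hypothetical and are not part of CMMD or any other library discussed here. In the homogeneous case a payload moves with one raw copy, while the heterogeneous case must touch and convert every element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Homogeneous case: sender and receiver share a data layout, so the
 * payload can be moved with a single raw copy and no conversion. */
void pack_homogeneous(uint32_t *dst, const uint32_t *src, size_t n)
{
    memcpy(dst, src, n * sizeof *src);
}

/* Reverse the byte order of one 32-bit word. */
uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Heterogeneous case (assumed for illustration): the peer uses the
 * opposite byte order, so every element is converted while packing. */
void pack_byteswapped(uint32_t *dst, const uint32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = swap32(src[i]);
}
```

Byte order is only one kind of conversion a heterogeneous system may need; the point of the sketch is that any per-element conversion adds work that a single-machine library can skip.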
ACLMPL was developed under constraints similar to those of
CMMD: message passing within a single multiproc-
essor machine (MPPs and SMPs) and sufficient
primitives without trying to be all encompassing.
ACLMPL is split into two groups: the synchronous
communications primitives and the asynchronous
primitives. On top of the synchronous primitives are
layered the global communications primitives. Split-
ting synchronous and asynchronous primitives into
two separate groups, with no overlap, makes sense.
Layering asynchronous on top of synchronous does
not make sense. Layering synchronous on top of
asynchronous will work, but it introduces additional
overhead (extra function calls, buffering, etc.), and as
the timings will show, synchronous communication
is faster than asynchronous communication. Additionally,
both are faster than the other message pass-
The following sections will describe the implementa-
tion of ACLMPL on the T3D. Later sections will
discuss the differences on the CM-5 and SGI.
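The buffering overhead mentioned above for layered primitives can be sketched in plain C. This is not ACLMPL code: the two functions are hypothetical stand-ins that model a direct transfer as one memory copy and a buffered transfer as a copy into a staging area followed by a copy out, and each returns the number of bytes it moved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Direct transfer: one copy from the sender's data to the receiver's
 * data. Returns the number of bytes copied. */
size_t transfer_direct(void *recv, const void *send, size_t nbytes)
{
    memcpy(recv, send, nbytes);
    return nbytes;
}

/* Buffered transfer: stage the message in an intermediate buffer, then
 * copy it out to the receiver. Returns the total bytes copied, or 0 if
 * the staging buffer cannot be allocated. */
size_t transfer_buffered(void *recv, const void *send, size_t nbytes)
{
    assert(recv != NULL && send != NULL);
    void *staging = malloc(nbytes);
    if (staging == NULL)
        return 0;
    memcpy(staging, send, nbytes);   /* first copy: sender to staging */
    memcpy(recv, staging, nbytes);   /* second copy: staging to receiver */
    free(staging);
    return 2 * nbytes;
}
```

For an 8-byte message, `transfer_direct` moves 8 bytes and `transfer_buffered` moves 16, in addition to the allocation. Real transfers between processors involve network hardware rather than `memcpy`, so the sketch captures only the extra copy, not actual timings.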
The synchronous message passing API in ACLMPL
was implemented first. Synchronous message pass-
ing has some potential performance advantages over
asynchronous methods since there is no need for in-
termediate buffering. Data can be sent directly from
the sender to the receiver with no need for additional
data copying. This can result in much higher band-
Painter, J.; McCormick, P.; Krogh, M.; Hansen, C. & Colin de Verdiere, G. The ACL Message Passing Library, article, September 1, 1995; New Mexico. (https://digital.library.unt.edu/ark:/67531/metadc624883/m1/4/: accessed April 22, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.