Comparative Study of Message Passing and Shared Memory Parallel Programming Models in Neural Network Training
… to message passing, due to the appearance of scalable shared memory multiprocessor platforms with embedded hardware support for cache coherence [Culler et al. 1999].
In this work we present a comparative performance study, utilizing a neural network training code which has been implemented in both OpenMP and MPI. In addition, the OpenMP and MPI versions of the parallel training code are further compared to an implementation utilizing SHMEM, the native SGI/CRAY environment for shared memory programming [Feind 1995]. In what follows we briefly describe the main characteristics of these environments.
" OpenMP is a standard developed for shared
memory multiprocessor computers. In these
platforms every processor has direct access to
the memory of every other processor which al-
lows them to directly load or store any shared
address. It is, in essence, a set of compiler di-
rectives to express shared-memory parallelism;
it allows incremental parallelization of exist-
ing sequential codes and can be used for loop-
level and for coarse grained parallelism. It al-
lows the progranuner the possibility of declar-
ing any pieces of memory as private to each
processor which greatly simplifies the develop-
ment of parallel programs. Further, it may be
used by application programmers who want a.
quick but not very effective parallelization. It is
the general consensus among researchers in this
field that OpenMP environment is more relevant
when used in: codes with large shared databases,
which needed to be stored only once per node in
OpenMP rather than once per processor in MPI;
codes with tasks that actually can benefit from
loop-level parallelism in addition to domain-level
parallelism; and in MPI-limited facilities.
" MPI (Message Passing Interface), also an in-
dustrial standard, is a message passing environ-
ment which assumes a processor cluster with
distributed memory able to work cooperatively.
This set of processors runs concurrently copies
of a. single program and use MPI library calls for
sending and receiving messages between proces-
sors as well as for tasks synchronization. How-
ever, MPI which is intended mainly for coarse
grained parallelism, has the disadvantage that
the program must be entirely decomposed for
parallel execution, and there exist no incremen-
tal way to parallelize an application. Neverthe-
less, in addition to be an industrial standard,
another major advantage consist in the fact thatit can be used efficiently in a multiprocessor
computer or in a cluster of workstations , and
furthermore, it can coexists with OpenMP and
SHMEM. Hence, it allows the possibility of de-
veloping efficient parallel programs able to run in
clusters of shared memory multiprocessors com-
puters (i.e. SMP's), using OpenMP within each
individual system and MPI whenever inter-SMP
communication is required.
" SHMEM, on the other hand, is a SGI/CRAY
native set of routines that take advantage of
the logically shared memory in systems such as
the Origin2000 and the CRAY T3D. A logically
shared memory is one which allows any proces-
sor unit in a multiprocessor platform to access
the memory of any the other processor without
the direct involvement of this later unit. It con-
sists of data passing library routines, similar to
those used in message passing, designed to maxi-
mize bandwidth and minimize data latency, thus
minimizing the overall computation overhead of
data transfer requests. It is intended exclusively
for coarse grained parallelism and although, in
contrast with MPI, it contains only a limited
number of routines, these are enough for a large
number of different applications.
In order to compare these three environments, we limit our study to coarse-grained parallelism based on the domain decomposition approach. In this technique, the parallel code goes through essentially the same steps as the sequential code and uses a set of parallel directives to perform data transfers and synchronization between processors. The study exploits
a neural network training code developed for the control of dynamical systems, which uses Radial Basis Neural Networks (RBNN's). RBNN's are an alternative type of ANN possessing the best representation property [Poggio et al. 1990]; they have higher convergence speeds than conventional feedforward multilayer neural networks, and under some conditions on the training set they are also free of local minima. This code makes use of RBNN's composed of Gaussian nodes in the hidden layer and sigmoidal units in the output.
The physical system under consideration is a zero-dimensional tokamak fusion reactor model with the design parameters of the ITER-EDA group. It is desired to stabilize this system at subignited nominal operation conditions for a wide range of energy confinement times. The plasma is composed of 50:50 DT, helium ions, a small fraction of high-Z impurities and electrons, and it is assumed that all particles share the same temperature at all times. Heating
Vitela, J.; Gordillo, J.; Cortina, L. & Hanebutte, U. Comparative Study of Message Passing and Shared Memory Parallel Programming Models in Neural Network Training, article, December 14, 1999; California. University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu/ark:/67531/metadc723216/m1/4/; crediting UNT Libraries Government Documents Department.