Efficient biased random bit generation for parallel processing Page: 73 of 101
This thesis or dissertation is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to UNT Digital Library by the UNT Libraries Government Documents Department.
CHAPTER 4. PARALLEL PLATFORMS AND RESULTS
the team slaves do not idle during a master block; they continue to execute the code. In
fact, the team slaves execute the code without waiting for the end of the master block. This
is an unacceptable situation, as the team slaves might reach the end of the code and cause
the program to terminate (program termination occurs when any team member
terminates). Also, the team slaves might continue without having the proper data
because the end of the master block may not have been reached. PCP provides a primitive
that allows the user to enforce a synchronization upon the processors that ensures the team
slaves won't inadvertently continue to execute the code. Synchronization is performed with
the barrier primitive. A barrier requires that all team members wait at the barrier until
the preceding work has been completed. Once all the processors, active or not, have arrived
at the barrier, the team members can continue executing.
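
The master block followed by a barrier can be illustrated with a minimal sketch. It is written in Python with threads rather than in PCP, so it is only an analogy: `run_team`, `member`, and the `shared` dictionary are hypothetical names introduced here, and `threading.Barrier` stands in for the PCP barrier primitive.

```python
import threading

def run_team(team_size):
    """Toy team: member 0 plays the master, the others are ordinary members."""
    shared = {"data": None}        # stands in for shared memory (assumption)
    seen = [None] * team_size      # what each member computed after the barrier
    barrier = threading.Barrier(team_size)

    def member(rank):
        if rank == 0:
            # "Master block": only one member initializes the shared data.
            shared["data"] = [i * i for i in range(8)]
        # Every member, master included, waits here until all have arrived,
        # so nobody reads shared["data"] before the master has written it.
        barrier.wait()
        seen[rank] = sum(shared["data"])

    threads = [threading.Thread(target=member, args=(r,)) for r in range(team_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Without the `barrier.wait()` call, a non-master member could reach `sum(shared["data"])` while the data is still `None`, which mirrors the missing-data problem described above.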
The sections of code which can be executed in parallel are usually those marked by
loops. In PCP, the parallel loop construct is the forall loop. The forall loop divides the
passes of the loop amongst the team members by interleaving the loop indices within the
team. This is known as fine-grained parallelism. A barrier primitive is normally placed
at the end of the forall loop to ensure that each team member has finished executing its
portion of the loop before the next section of code is executed.
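
As a rough illustration of interleaved loop indices, the sketch below assumes a cyclic distribution in which member `rank` of a team of size `team_size` takes indices `rank`, `rank + team_size`, and so on. The text states only that the indices are interleaved, so the exact mapping is an assumption, as are the names `forall_indices` and `parallel_square`. Python threads again stand in for PCP team members.

```python
import threading

def forall_indices(rank, team_size, n):
    """Indices handled by one member under the assumed cyclic interleaving."""
    return range(rank, n, team_size)

def parallel_square(values, team_size):
    """Square every element, splitting the loop passes across the team."""
    out = [None] * len(values)
    complete = [False] * team_size
    barrier = threading.Barrier(team_size)

    def member(rank):
        # Each member executes only its own share of the loop passes.
        for i in forall_indices(rank, team_size, len(values)):
            out[i] = values[i] ** 2
        # Barrier at the end of the loop: wait until every member is done.
        barrier.wait()
        # After the barrier, every slot of `out` has been filled.
        complete[rank] = all(v is not None for v in out)

    threads = [threading.Thread(target=member, args=(r,)) for r in range(team_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out, complete
```

With a team of three and ten loop passes, member 1 would handle indices 1, 4, and 7 under this assumed mapping.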
It may be noted that since the indices of the forall loop are interleaved, the value of
a shared variable, such as the accumulator that holds the aggregate averaged value for a
particular lattice site, is not deterministic. This is because the individual team
members may access the shared variable in any order and at any time. Just as correct
program execution is ensured by the use of the barrier synchronization primitive, correct
shared variable evaluation is ensured by the use of the lock and unlock primitives. These
two primitives are used to maintain single access to shared variables or critical sections of the
code. Each time a processor accesses a critical section, it sets the lock for that section, so that no
other team member can bypass the lock and gain access to the shared variable, and unlocks
the critical section when access is no longer required. Once the critical section is unlocked,
the other requesting processors compete to determine which one has the next access. The
operating system, not PCP, ensures that each team member has an equal chance of gaining
access and that every processor will, at some point, get access to the shared variable.
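
The locking pattern described above might look as follows in a minimal Python sketch. Here `threading.Lock` plays the role of the lock and unlock primitives, and `accumulate` and `total` are hypothetical names. The sketch does not model the fairness guarantee that the text attributes to the operating system.

```python
import threading

def accumulate(values, team_size):
    """Sum `values` into one shared accumulator guarded by a lock."""
    total = {"sum": 0}             # shared accumulator (assumption: a plain dict)
    lock = threading.Lock()

    def member(rank):
        for i in range(rank, len(values), team_size):
            contribution = values[i]   # private work, done outside the lock
            # Critical section: acquire the lock, update, then release it.
            # `with lock:` corresponds to a lock/unlock pair.
            with lock:
                total["sum"] += contribution

    threads = [threading.Thread(target=member, args=(r,)) for r in range(team_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["sum"]
```

The order in which members update the accumulator still varies from run to run; the lock only ensures that each read-modify-write completes without interference from another member.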
PCP includes functions that give the status of the team being used. That is, how many
Slone, D.M. Efficient biased random bit generation for parallel processing, thesis or dissertation, September 28, 1994; California. (https://digital.library.unt.edu/ark:/67531/metadc625763/m1/73/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.