Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
XSufferage, by Casanova et al., also considers data
location when scheduling tasks to exploit reuse. Com-
pared with these techniques, our focus is quite different.
Instead of investigating intelligent ways to reuse data,
our objective is to understand the conditions under which
the data transfer overhead must be considered in dis-
tributed job scheduling, and how the advantages of a grid
environment are affected by variations in this parameter.
We therefore assume no data reuse. All input files are
resident on the host where the job is submitted, and all
output files must be redirected to the same machine. This
assumption holds for most scientific applications.
We evaluate our grid scheduling algorithms by sim-
ulating compute servers, various groupings of servers
into sites, and inter-server networks, and drive these
simulations using real workloads derived from trace data
gathered from leading supercomputing centers over the
same time period. The experimental methodology is
described in Section IV. We gather several key perfor-
mance metrics, and use them to compare the behavior
of our algorithms against reference local and centralized
scheduling schemes. Our specific objective in this paper
is to understand the effects of data migration.
Our results, presented in Section V, show that our
best scheduling strategy delivers turnaround times that
are 60% smaller than those without grid scheduling,
even in the presence of input/output data migration.
Alternatively, our algorithm can execute 40% more jobs
in the grid environment and deliver the same turnaround
times as in a non-grid scenario. Finally, for large data
files (or slow networks), we find that it is imperative to
consider data transfer times when making job migration
decisions as results show an increase of up to 43x in job
turnaround times if data migration overhead is ignored.
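The effect of ignoring transfer time can be illustrated with a minimal sketch. The cost model used here (queue wait plus run time, plus input and output sizes divided by bandwidth for a remote site), the names `Job`, `Site`, `turnaround`, and `pick_site`, and all numbers are assumptions made for illustration; they are not taken from the paper's simulator or its results.

```python
from dataclasses import dataclass


@dataclass
class Job:
    run_time_s: float   # estimated execution time, in seconds
    input_mb: float     # input data resident on the submission host
    output_mb: float    # output data that must return to the submission host


@dataclass
class Site:
    name: str
    queue_wait_s: float    # estimated wait in the site's local queue
    bandwidth_mb_s: float  # link to the submission host (unused for the home site)


def turnaround(job, site, home, include_transfer=True):
    """Estimated turnaround of `job` on `site` under the assumed cost model."""
    total = site.queue_wait_s + job.run_time_s
    if include_transfer and site is not home:
        # No data reuse: inputs go out and outputs come back over the network.
        total += (job.input_mb + job.output_mb) / site.bandwidth_mb_s
    return total


def pick_site(job, sites, home, include_transfer=True):
    """Choose the site with the smallest estimated turnaround."""
    return min(sites, key=lambda s: turnaround(job, s, home, include_transfer))


# Hypothetical scenario: a busy home site and an idle remote site on a slow link.
home = Site("home", queue_wait_s=600.0, bandwidth_mb_s=float("inf"))
remote = Site("remote", queue_wait_s=0.0, bandwidth_mb_s=1.0)
job = Job(run_time_s=300.0, input_mb=2000.0, output_mb=1000.0)

naive = pick_site(job, [home, remote], home, include_transfer=False)
aware = pick_site(job, [home, remote], home, include_transfer=True)
print(naive.name, turnaround(job, naive, home))
print(aware.name, turnaround(job, aware, home))
```

With these invented numbers, the rule that ignores transfer time migrates the job to the idle remote site and ends up with an actual turnaround of 3300 s, while the transfer-aware rule keeps the job at home and finishes in 900 s.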
II. GRID SCHEDULING ARCHITECTURE
We use a simple grid scheduling architecture, shown
in Fig. 1, for evaluating our proposed job migration
algorithms. The architecture is composed of distributed
compute servers, local schedulers with local queues, and
grid schedulers with grid queues. A new job is always
submitted to the grid scheduler (GS) of the compute
server with which it has an "affinity" and placed in the
associated grid queue (GQ). The GS then analyzes the
job's resource requirements after gathering local infor-
mation from the corresponding local scheduler (LS) and
remote information from its peer GSs. The LS provides
data about the local queue (LQ) and the local compute
server, while other GSs supply data about remote sites.
Based on all this information, the GS determines whether
to send the job from the GQ to its own LQ or to the
LQ of another server through the grid middleware and
appropriate GS. Once a job is placed in an LQ, the LS
schedules it for execution on the compute server using
the local scheduling policy. One issue not addressed in
this work is how, in practice, a GS locates other GSs. We
expect to use traditional peer-to-peer (P2P) strategies
that rely on centralized or distributed indices, and plan to
examine this question in detail at a later time.
Fig. 1. Our grid scheduling architecture (solid arrows represent
movement of jobs, dashed arrows represent transfer of information).
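The job flow described above (submission to a GS, placement in its GQ, a decision based on local and peer information, then hand-off to an LQ) can be sketched as follows. This is only a rough illustration: the class and method names are hypothetical, queue length stands in for the information exchanged between schedulers, and the least-loaded placement rule is a placeholder rather than one of the paper's algorithms.

```python
from collections import deque


class LocalScheduler:
    """LS: owns the local queue (LQ) of one compute server."""

    def __init__(self):
        self.local_queue = deque()

    def load(self):
        # Assumed load measure: number of jobs waiting in the LQ.
        return len(self.local_queue)

    def enqueue(self, job):
        self.local_queue.append(job)


class GridScheduler:
    """GS: co-located with one LS and aware of its peer GSs."""

    def __init__(self, name):
        self.name = name
        self.ls = LocalScheduler()
        self.grid_queue = deque()
        self.peers = []

    def report(self):
        # Information this GS supplies to its peers about its own site.
        return self.ls.load()

    def submit(self, job):
        # A new job always enters the GQ of the server it has affinity with.
        self.grid_queue.append(job)

    def dispatch(self):
        # Move each queued job to the LQ of the least-loaded site.
        # Ties favor the local site because it is listed first.
        while self.grid_queue:
            job = self.grid_queue.popleft()
            target = min([self] + self.peers, key=lambda gs: gs.report())
            target.ls.enqueue(job)  # remote placement goes through the peer GS


a, b = GridScheduler("A"), GridScheduler("B")
a.peers, b.peers = [b], [a]
for job_id in ["j1", "j2", "j3"]:
    a.submit(job_id)
a.dispatch()
print(list(a.ls.local_queue), list(b.ls.local_queue))
```

In this toy run, all three jobs are submitted at site A, and the placeholder rule spreads them across the two local queues. A real GS would base the decision on richer information, such as job resource requirements and data transfer costs.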
There are other grid scheduling architectures that we
could have adopted. A centralized scheme with a single
scheduler for multiple computer systems might be a good
choice for a relatively small set of servers at a single
site, but the approach does not scale and is not fault
tolerant. A hierarchy of grid schedulers organized into
a tree where jobs flow up and down is an interesting
approach, but we do not expect it to scale as
well as a P2P strategy. A variation of our architecture
combines pairs of local and grid schedulers into a single
scheduler. This is starting to occur as vendors adopt a
grid perspective to scheduling, but these systems
do not interoperate and are not yet widely used.
In another approach to grid scheduling, users
employ user-level schedulers to select the compute
servers to which they submit their applications. This strategy
is somewhat similar to our P2P method, the difference
being that user-level grid schedulers seek to optimize the
execution of jobs for a single user while our grid sched-
ulers strive to optimize the execution of all jobs. We
believe this distinction results in the P2P grid scheduling
approach having potentially greater overall performance.
In the end, we chose a P2P architecture with a grid
scheduler co-located with each local scheduler. This
strategy gives us the best scalability, fault tolerance,
and scheduling performance without requiring that sites
replace their local schedulers.
III. GRID SCHEDULING ALGORITHMS
This section presents the three distributed scheduling
algorithms that are the subject of this work, and the two
reference algorithms against which they are compared.
The three distributed algorithms are sender-initiated,
receiver-initiated, and symmetrically-initiated. All three
operate in a P2P manner but use different strategies
Oliker, Leonid; Biswas, Rupak; Shan, Hongzhang & Smith, Warren. Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration, article, January 1, 2004; Berkeley, California. (https://digital.library.unt.edu/ark:/67531/metadc901613/m1/2/: accessed April 19, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.