DZero data-intensive computing on the Open Science Grid
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to UNT Digital Library by the UNT Libraries Government Documents Department.
As one of its main responsibilities, SAM-Grid implements the selection of computing and storage
resources during workload scheduling. For the selection of storage resources and the distribution of data,
SAM-Grid relies on Sequential Access via Metadata (SAM) [6], a data handling system responsible for
the management of storage elements and for job-related metadata cataloguing. For selection of
computing resources, SAM-Grid relies on specialized deployments of standard middleware
components, such as Condor-G [7] and the Globus Gatekeeper [8], to distribute the workload between
OSG, LCG, and the other available resources. The SAM-Grid job selection infrastructure interfaces to
grid-specific brokering systems, like the LCG resource broker [9] or the OSG Resource Selection
System (ReSS) [10], for fine-grained intra-grid cluster selection. In addition, the SAM-Grid system
provides interfaces for job management (job submission, job status tracking, job deletion, job failure
recovery, etc.) as well as job workflow monitoring.
4. System Commissioning
The data reprocessing activity started with a ramp-up phase of resource commissioning. Out of all the
available computing facilities, on the order of 80 in the OSG alone, resources were included in the
system through a long multi-step process that could be only partially automated. First, network connectivity
to storage resources had to be tested for sufficiently high bandwidth and low latency. Second, data
reprocessing jobs were run on reference input data, so that output could be compared with expected
results (site certification). Third, appropriate data transfer queues were created in the SAM system.
This achieved two goals: (1) separating requests for data between sites with low and high network
latencies; (2) shaping network traffic for applications with different data input patterns, e.g., data
reprocessing applications vs. data merging applications. Finally, sites with local storage were
preferred, in order to enable caching of the application input.
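The queue separation described above can be illustrated with a minimal Python sketch. It is not the SAM implementation: the latency threshold, the queue naming scheme, and the `DataRequest` and `select_queue` names are all assumptions made for illustration, since the text gives neither numbers nor an API.

```python
from dataclasses import dataclass

# Hypothetical cut between "low" and "high" latency sites.
# The source text does not state a value.
LATENCY_THRESHOLD_MS = 50.0


@dataclass(frozen=True)
class DataRequest:
    """A request for input data issued by a job at a given site (illustrative)."""
    site: str
    latency_ms: float   # measured latency between the site and the storage resource
    application: str    # e.g. "reprocessing" or "merging"


def select_queue(request: DataRequest) -> str:
    """Pick a transfer queue name from latency class and application type.

    Goal (1): requests from low- and high-latency sites go to different queues.
    Goal (2): applications with different input patterns go to different queues.
    """
    latency_class = "low" if request.latency_ms < LATENCY_THRESHOLD_MS else "high"
    return f"{request.application}-{latency_class}-latency"


if __name__ == "__main__":
    for req in (
        DataRequest("siteA", 10.0, "reprocessing"),
        DataRequest("siteB", 120.0, "reprocessing"),
        DataRequest("siteA", 10.0, "merging"),
    ):
        print(req.site, req.application, "->", select_queue(req))
```

Keying the queue on both properties means a slow site cannot hold up transfers for well-connected ones, and merging traffic can be shaped independently of reprocessing traffic.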
The following subsections explore in more detail some of these commissioning steps.
4.1. Site Certification
In the early days of the DZero experiment, data reprocessing activities were conducted on a few
closely controlled computing clusters. As more and more computing resources were made available
through grid interfaces, systems like SAM-Grid made it possible to manage the complexity of running
data-intensive activities over large distributed systems [11]. The increased diversity of the available
resources, however, made it necessary to verify that the physics results were invariant with respect to
where they were produced. To accomplish this goal, the DZero experiment developed a site
certification process.
[Figure: two histogram panels of the ETA distribution (panel labels appear to read "LTU" and "D0farm"). The x-axis, ETA, runs from -4 to 4; the y-axis runs from 0 to 4000. The statistics boxes (Entries, Mean, RMS) are only partly legible in the extracted text.]
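Certification relies on comparing the output produced at a candidate site with expected results. The text on this page does not say which statistical comparison was used, so the following Python sketch swaps in a standard two-sample chi-square test on binned histograms as one plausible way to do it. The function names and the acceptance threshold are assumptions, not part of the DZero procedure.

```python
import math
from typing import Sequence, Tuple


def chi2_two_histograms(ref: Sequence[float], test: Sequence[float]) -> Tuple[float, int]:
    """Chi-square statistic between two binned histograms.

    Returns (chi2, number_of_bins_used). The square-root factors rescale
    the two histograms so that they may have different numbers of entries.
    """
    if len(ref) != len(test):
        raise ValueError("histograms must have the same binning")
    total_ref, total_test = sum(ref), sum(test)
    if total_ref == 0 or total_test == 0:
        raise ValueError("histograms must not be empty")

    scale_ref = math.sqrt(total_test / total_ref)
    scale_test = math.sqrt(total_ref / total_test)

    chi2, bins_used = 0.0, 0
    for r, t in zip(ref, test):
        if r + t == 0:
            continue  # bins empty in both histograms carry no information
        chi2 += (scale_ref * r - scale_test * t) ** 2 / (r + t)
        bins_used += 1
    return chi2, bins_used


def is_consistent(ref: Sequence[float], test: Sequence[float],
                  max_reduced_chi2: float = 2.0) -> bool:
    """Accept the site if chi2 per used bin is below a hypothetical threshold."""
    chi2, bins_used = chi2_two_histograms(ref, test)
    return chi2 / bins_used <= max_reduced_chi2


if __name__ == "__main__":
    reference = [500, 1500, 3500, 1500, 500]   # made-up bin contents
    candidate = [510, 1480, 3520, 1490, 505]   # made-up bin contents
    print(is_consistent(reference, candidate))
```

In practice such a check would be applied to each reference distribution, and a site would be certified only if all of them agree with the reference within the chosen tolerance.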
Abbott, B.; U., /Oklahoma; Baranovski, A.; Diesburg, M.; Garzoglio, G.; /Fermilab et al. DZero data-intensive computing on the Open Science Grid, article, September 1, 2007; Batavia, Illinois. (https://digital.library.unt.edu/ark:/67531/metadc886694/m1/3/: accessed April 23, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.