An Empirical Performance Analysis of Commodity
Memories in Commodity Servers
Darren J. Kerbyson, Mike Lang
Performance and Architecture Lab (PAL)
Los Alamos National Laboratory
NM 87545
+1 (505) 667 4913
{djk,mlang}@lanl.gov
"The difference between false memories and true ones
is the same as for jewels: it is always the false ones that
look the most real, the most brilliant."
- Salvador Dali
ABSTRACT
This work details a performance study of six different commodity
memories in two commodity server nodes, using a number of micro-
benchmarks that measure low-level performance characteristics
as well as two applications representative of the ASCI
workload. The memories vary both in terms of performance,
including latency and bandwidths, and also in terms of their
physical properties and manufacturer. Two server nodes were
used: one Itanium-II Madison-based system, and one Xeon-based
system. All the memories examined can be used within both
processing nodes. This allows the performance of the memories to
be directly examined while keeping all other factors within a
processing node the same (processor, motherboard, operating
system etc.). The results of this study show that there can be a
significant difference in application performance from the
different memories - by as much as 20%. Thus, by choosing the
most appropriate memory for a processing node at a minimal cost
differential, significantly improved performance may be achievable.
Categories and Subject Descriptors
C.4 [Performance of Systems]: Measurement techniques and
Performance attributes.
General Terms
Performance
Keywords
Memory System Performance, Memory Modules, Performance
Measurement, Performance Analysis.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
Memory System Performance'04, June 8, 2004, Washington, DC, USA.
Copyright 2004 ACM 1-58113-000-0/00/0004...$5.00.
Gene Patino
SMART Modular Technologies Inc.
15635 Alton Parkway
Irvine, CA 92618
+1 (949) 753 0116 ext. 129
Gene.Patino@smartm.com
1. INTRODUCTION
System memory is typically a large component in the cost of any
computer purchase. However, its specification is usually distilled
down to just a size in gigabytes per processor or node. This
investigation shows that a closer look at memory is necessary in
order to achieve higher overall system performance. There has
been much performance analysis of commodity-based cluster
systems, but little that analyses the differences between
commodity memories. The differential in cost between the
memories is minimal while, as we will show, the performance
differential can amount to tens of percent.
The characteristics of commodity memories include:
Bandwidth - the bus speed of the processing node determines the
bandwidth that a memory module can operate at.
Latency - the latency to the main memory that can be achieved
from the processing node. Note that there are several
components to the latency, including the CL latency (CAS,
or Column-Address-Strobe, Latency), the Row Precharge Time
(tRP), and the Row Address to Column Address Delay
(tRCD).
Packaging - memory modules vary in physical dimensions, and in
DRAM IC packaging (including TSOP and BGA packages).
Manufacturer - several manufacturers have a significant fraction
of the DRAM market.
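The way the three latency components listed above combine can be sketched with a first-order model. The three row states used here (row already open, row closed, a different row open) are a standard textbook simplification and an assumption of this sketch, not something taken from the measurements in this work; the function name and the example timings are illustrative only, and effects from the memory controller and chipset are ignored.

```python
def access_latency_cycles(cl, trp, trcd):
    """First-order DRAM access latency in memory clock cycles.

    Assumed model (not from the paper):
      row_hit      - the target row is already open: only CL is paid
      row_closed   - no row is open: activate the row (tRCD), then CL
      row_conflict - another row is open: precharge (tRP), activate
                     (tRCD), then CL
    """
    return {
        "row_hit": cl,
        "row_closed": trcd + cl,
        "row_conflict": trp + trcd + cl,
    }


# Illustrative timings only: CL = 2.0, tRP = 2, tRCD = 2 cycles.
for case, cycles in access_latency_cycles(cl=2.0, trp=2, trcd=2).items():
    print(f"{case}: {cycles} cycles")
```

Under this model a lower CL helps every access, whereas tRP and tRCD matter only when the access pattern keeps changing rows, which is one reason the rated numbers alone need not predict application performance.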
When considering the performance of a memory, the module is
typically referred to by its bus speed (e.g. PC2100 or 266MHz),
CL latency, tRP and tRCD. For example, a PC2100 CL2.0-2-2
memory works on a 266MHz bus and has a CL latency of 2.0 cycles,
a tRP of 2 cycles and a tRCD of 2 cycles. One would expect that
the higher performing memories have a higher rated bandwidth
and lower rated latency. As we will show, this is not necessarily
the case.
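As a rough illustration of the naming convention just described, the sketch below splits a designation into its rated numbers and converts them into a peak bandwidth and latencies in nanoseconds. Several things here are assumptions rather than statements from this work: the helper names are hypothetical, the module is taken to be DDR with a 64-bit (8-byte) data bus, and the quoted 266MHz figure is treated as the data rate, so the timing cycles are counted on a clock of half that rate. The field order CL-tRP-tRCD follows the text above.

```python
import re

BUS_WIDTH_BYTES = 8  # assumption: 64-bit module data bus


def parse_module(designation):
    """Parse a string such as 'PC2100 CL2.0-2-2' (order: CL-tRP-tRCD)."""
    pattern = r"PC(\d+)\s+CL(\d+(?:\.\d+)?)-(\d+)-(\d+)"
    m = re.fullmatch(pattern, designation.strip())
    if m is None:
        raise ValueError(f"unrecognised designation: {designation!r}")
    return {
        "rated_mb_per_s": int(m.group(1)),  # the number after 'PC'
        "cl": float(m.group(2)),            # CAS latency, cycles
        "trp": int(m.group(3)),             # row precharge, cycles
        "trcd": int(m.group(4)),            # row-to-column delay, cycles
    }


def peak_bandwidth_mb_per_s(data_rate_mhz):
    """Peak transfer rate: transfers per second times bytes per transfer."""
    return data_rate_mhz * BUS_WIDTH_BYTES


def cycles_to_ns(cycles, data_rate_mhz):
    """Convert timing cycles to ns, assuming DDR (clock = data rate / 2)."""
    clock_mhz = data_rate_mhz / 2
    return cycles * 1000.0 / clock_mhz


module = parse_module("PC2100 CL2.0-2-2")
print(module)
print(peak_bandwidth_mb_per_s(266), "MB/s peak")
print(round(cycles_to_ns(module["cl"], 266), 2), "ns CL")
```

With these assumptions the peak works out to 2128 MB/s, which the PC2100 name rounds to 2100, and a CL of 2.0 cycles corresponds to roughly 15 ns. These are rated figures only; they say nothing about the latency or bandwidth actually achieved from a processing node.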
Two nodes are used in this work - a Dell PowerEdge 2650 server
containing two Intel 2.8-GHz Xeon processors and a Dell
PowerEdge 3250 server containing two Itanium-II 1.3-GHz
Madison processors. Both are popular in the construction of high-
performance clusters. The Dell 2650 uses the ServerWorks GC-
LE chipset, and the Dell 3250 uses the Intel E8870 chipset.
The memory modules that were made available for testing are
listed in Table 1. The memories are ordered in terms of their CL-
tRP-tRCD latencies. As can be seen, the memories differ in terms
of their manufacturer, packaging, and physical dimensions. The
first four memory modules were supplied by Smart Modular