A Testbed of Parallel Kernels for Computer Science Research

PDF Version Also Available for Download.

Description

For several decades, computer scientists have sought guidance on how to evolve architectures, languages, and programming models for optimal performance, efficiency, and productivity. Unfortunately, this guidance is most often taken from the existing software/hardware ecosystem. Architects attempt to provide micro-architectural solutions to improve performance on fixed binaries. Researchers tweak compilers to improve code generation for existing architectures and implementations, and they may invent new programming models for fixed processor and memory architectures and computational algorithms. In today's rapidly evolving world of on-chip parallelism, these isolated and iterative improvements to performance may miss superior solutions in the same way gradient descent ... continued below

Physical Description

4

Creation Information

Bailey, David; Demmel, James; Ibrahim, Khaled; Kaiser, Alex; Koniges, Alice; Madduri, Kamesh et al. April 30, 2010.

Context

This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided by UNT Libraries Government Documents Department to Digital Library, a digital repository hosted by the UNT Libraries. More information about this report can be viewed below.

Who

People and organizations associated with either the creation of this report or its content.

Publisher

Provided By

UNT Libraries Government Documents Department

Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.

Contact Us

What

Descriptive information to help identify this report. Follow the links below to find similar items on the Digital Library.

Description

For several decades, computer scientists have sought guidance on how to evolve architectures, languages, and programming models for optimal performance, efficiency, and productivity. Unfortunately, this guidance is most often taken from the existing software/hardware ecosystem. Architects attempt to provide micro-architectural solutions to improve performance on fixed binaries. Researchers tweak compilers to improve code generation for existing architectures and implementations, and they may invent new programming models for fixed processor and memory architectures and computational algorithms. In today's rapidly evolving world of on-chip parallelism, these isolated and iterative improvements to performance may miss superior solutions in the same way gradient descent optimization techniques may get stuck in local minima. In an initial study, we have developed an alternate approach that, rather than starting with an existing hardware/software solution laced with hidden assumptions, defines the computational problems of interest and invites architects, researchers and programmers to implement novel hardware/ software co-designed solutions. Our work builds on the previous ideas of computational dwarfs, motifs, and parallel patterns by selecting a representative set of essential problems for which we provide: An algorithmic description; scalable problem definition; illustrative reference implementations; verification schemes. For simplicity, we focus initially on the computational problems of interest to the scientific computing community but proclaim the methodology (and perhaps a subset of the problems) as applicable to other communities. We intend to broaden the coverage of this problem space through stronger community involvement. Previous work has established a broad categorization of numerical methods of interest to the scientific computing, in the spirit of the NAS Benchmarks, which pioneered the basic idea of a 'pencil and paper benchmark' in the 1990s. The initial result of the more modern study was the seven dwarfs, which was subsequently extended to 13 motifs. These motifs have already been useful in defining classes of applications for architecture-software studies. However, these broad-brush problem statements often miss the nuance seen in individual kernels. For example, the computational requirements of particle methods vary greatly between the naive (but more accurate) direct calculations and the particle-mesh and particle-tree codes. Thus we commenced our study with an enumeration of problems, but then proceeded by providing not only reference implementations for each problem, but more importantly a mathematical definition that allows one to escape iterative approaches to software/hardware optimization. To ensure long term value, we have augmented each of our reference implementations with both a scalable problem generator and a verification scheme. In a paper we have prepared that documents our efforts, we describe in detail this process of problem definition, scalable input creation, verification, and implementation of reference codes for the scientific computing domain. Table 1 enumerates and describes the level of support we've developed for each kernel. We group these important kernels using the Berkeley dwarfs/motifs taxonomy using a red box in the appropriate column. As kernels become progressively complex, they build upon other, simpler computational methods. We note this dependency via orange boxes. After enumeration of the important numerical problems, we created a domain-appropriate high-level definition of each problem. To ensure future endeavors are not tainted by existing implementations, we specified the problem definition to be independent of both computer architecture and existing programming languages, models, and data types. Then, to provide context as to how such kernels productively map to existing architectures, languages and programming models, we produced reference implementations for most of the kernel (see table). These sample codes should be viewed as 'hints,' designed to show how other designers have mapped a problem's operands and operators to existing hardware and software. Since we wanted such implementations to be illustrative, we tried to ensure they were the most straightforward implementation in the easiest to understand languages using familiar architectures. To that end, most of the linear algebra-oriented computations are written in MATLAB using array indexing to process matrices, rather than one-line library calls to compute the same kernel. This ensures that the kernel's computation is explicit and readable in the implementation and not hidden behind a library. For other problems, such as the Barnes-Hut n-Body solver, the implementations were written in pure C, without any supporting library computations. Along this line, we have also created a scalable problem generator to accompany each computation.

Physical Description

4

Language

Item Type

Identifier

Unique identifying numbers for this report in the Digital Library or other systems.

  • Report No.: LBNL-3401E
  • Grant Number: DE-AC02-05CH11231
  • DOI: 10.2172/983273 | External Link
  • Office of Scientific & Technical Information Report Number: 983273
  • Archival Resource Key: ark:/67531/metadc1014841

Collections

This report is part of the following collection of related materials.

Office of Scientific & Technical Information Technical Reports

Reports, articles and other documents harvested from the Office of Scientific and Technical Information.

Office of Scientific and Technical Information (OSTI) is the Department of Energy (DOE) office that collects, preserves, and disseminates DOE-sponsored research and development (R&D) results that are the outcomes of R&D projects or other funded activities at DOE labs and facilities nationwide and grantees at universities and other institutions.

What responsibilities do I have when using this report?

When

Dates and time periods associated with this report.

Creation Date

  • April 30, 2010

Added to The UNT Digital Library

  • Oct. 14, 2017, 8:36 a.m.

Description Last Updated

  • Oct. 17, 2017, 7 p.m.

Usage Statistics

When was this report last used?

Congratulations! It looks like you are the first person to view this item online.

Interact With This Report

Here are some suggestions for what to do next.

Start Reading

PDF Version Also Available for Download.

Citations, Rights, Re-Use

Bailey, David; Demmel, James; Ibrahim, Khaled; Kaiser, Alex; Koniges, Alice; Madduri, Kamesh et al. A Testbed of Parallel Kernels for Computer Science Research, report, April 30, 2010; Berkeley, California. (digital.library.unt.edu/ark:/67531/metadc1014841/: accessed December 17, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.