431 Matching Results

Search Results

Supercomputer debugging workshop 1991 proceedings

Description: This report discusses the following topics on supercomputer debugging: Distributed debugging; user interface to debugging tools and standards; debugging optimized codes; debugging parallel codes; and debugger performance and interface as analysis tools. (LSP)
Date: January 1, 1991
Creator: Brown, J.
Partner: UNT Libraries Government Documents Department

NA-NET numerical analysis net

Description: This report describes a facility called NA-NET created to allow numerical analysts (na) an easy method of communicating with one another. The main advantage of the NA-NET is uniformity of addressing. All mail is addressed to the Internet host "na-net.ornl.gov" at Oak Ridge National Laboratory. Hence, members of the NA-NET do not need to remember complicated addresses or even where a member is currently located. As long as moving members change their e-mail address in the NA-NET everything works smoothly. The NA-NET system is currently located at Oak Ridge National Laboratory. It is running on the same machine that serves netlib. Netlib is a separate facility that distributes mathematical software via electronic mail. For more information on netlib consult, or send the one-line message "send index" to netlib@ornl.gov. The following report describes the current NA-NET system from both a user's perspective and from an implementation perspective. Currently, there are over 2100 members in the NA-NET. An average of 110 mail messages pass through this facility daily.
Date: December 1, 1991
Creator: Dongarra, J. (Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science; Oak Ridge National Lab., TN (United States)) & Rosener, B. (Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science)
Partner: UNT Libraries Government Documents Department

NA-NET numerical analysis net

Description: This report describes a facility called NA-NET created to allow numerical analysts (na) an easy method of communicating with one another. The main advantage of the NA-NET is uniformity of addressing. All mail is addressed to the Internet host "na-net.ornl.gov" at Oak Ridge National Laboratory. Hence, members of the NA-NET do not need to remember complicated addresses or even where a member is currently located. As long as moving members change their e-mail address in the NA-NET everything works smoothly. The NA-NET system is currently located at Oak Ridge National Laboratory. It is running on the same machine that serves netlib. Netlib is a separate facility that distributes mathematical software via electronic mail. For more information on netlib consult, or send the one-line message "send index" to netlib@ornl.gov. The following report describes the current NA-NET system from both a user's perspective and from an implementation perspective. Currently, there are over 2100 members in the NA-NET. An average of 110 mail messages pass through this facility daily.
Date: December 1, 1991
Creator: Dongarra, J. & Rosener, B.
Partner: UNT Libraries Government Documents Department

A parallel algorithm for the non-symmetric eigenvalue problem

Description: This paper describes a parallel algorithm for computing the eigenvalues and eigenvectors of a non-symmetric matrix. The algorithm is based on a divide-and-conquer procedure and uses an iterative refinement technique.
Date: December 1, 1991
Creator: Dongarra, J. & Sidani, M. (Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science; Oak Ridge National Lab., TN (United States))
Partner: UNT Libraries Government Documents Department

Supercomputer debugging workshop 1991 proceedings

Description: This report discusses the following topics on supercomputer debugging: Distributed debugging; user interface to debugging tools and standards; debugging optimized codes; debugging parallel codes; and debugger performance and interface as analysis tools. (LSP)
Date: December 31, 1991
Creator: Brown, J.
Partner: UNT Libraries Government Documents Department

Reduction to condensed form for the eigenvalue problem on distributed memory architectures

Description: In this paper, we describe a parallel implementation for the reduction of general and symmetric matrices to Hessenberg and tridiagonal form, respectively. The methods are based on LAPACK sequential codes and use a panel-wrapped mapping of matrices to nodes. Results from experiments on the Intel Touchstone Delta are given.
Date: January 1, 1992
Creator: Dongarra, J.J. (Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science; Oak Ridge National Lab., TN (United States). Mathematical Sciences Section) & van de Geijn, R.A. (Texas Univ., Austin, TX (United States). Dept. of Computer Sciences)
Partner: UNT Libraries Government Documents Department

Reduction to condensed form for the eigenvalue problem on distributed memory architectures

Description: In this paper, we describe a parallel implementation for the reduction of general and symmetric matrices to Hessenberg and tridiagonal form, respectively. The methods are based on LAPACK sequential codes and use a panel-wrapped mapping of matrices to nodes. Results from experiments on the Intel Touchstone Delta are given.
Date: January 1, 1992
Creator: Dongarra, J. J. & van de Geijn, R. A.
Partner: UNT Libraries Government Documents Department

The hierarchical spatial decomposition of three-dimensional particle-in-cell plasma simulations on MIMD distributed memory multiprocessors

Description: The hierarchical spatial decomposition method is a promising approach to decomposing the particles and computational grid in parallel particle-in-cell application codes, since it is able to maintain approximate dynamic load balance while keeping communication costs low. In this paper we investigate issues in implementing a hierarchical spatial decomposition on a hypercube multiprocessor. Particular attention is focused on the communication needed to update guard ring data, and on the load balancing method. The hierarchical approach is compared with other dynamic load balancing schemes.
Date: July 1, 1992
Creator: Walker, D.W.
Partner: UNT Libraries Government Documents Department

The hierarchical spatial decomposition of three-dimensional particle-in-cell plasma simulations on MIMD distributed memory multiprocessors

Description: The hierarchical spatial decomposition method is a promising approach to decomposing the particles and computational grid in parallel particle-in-cell application codes, since it is able to maintain approximate dynamic load balance while keeping communication costs low. In this paper we investigate issues in implementing a hierarchical spatial decomposition on a hypercube multiprocessor. Particular attention is focused on the communication needed to update guard ring data, and on the load balancing method. The hierarchical approach is compared with other dynamic load balancing schemes.
Date: July 1, 1992
Creator: Walker, D. W.
Partner: UNT Libraries Government Documents Department

Speedup properties of phases in the execution profile of distributed parallel programs

Description: The execution profile of a distributed-memory parallel program specifies the number of busy processors as a function of time. Periods of homogeneous processor utilization are manifested in many execution profiles. These periods can usually be correlated with the algorithms implemented in the underlying parallel code. Three families of methods for smoothing execution profile data are presented. These approaches simplify the problem of detecting end points of periods of homogeneous utilization. These periods, called phases, are then examined in isolation, and their speedup characteristics are explored. A specific workload executed on an Intel iPSC/860 is used for validation of the techniques described.
Date: August 1, 1992
Creator: Carlson, B. M.; Wagner, T. D.; Dowdy, L. W. & Worley, P. H.
Partner: UNT Libraries Government Documents Department
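
The abstract above does not name its three families of smoothing methods, so the following is only a minimal sketch of the general idea, assuming a centered moving average as the smoother and a simple tolerance test as the phase detector. The function names `moving_average` and `find_phases`, the `window` and `tolerance` parameters, and the sample profile are all hypothetical and are not taken from the report.

```python
def moving_average(profile, window):
    """Smooth a busy-processor profile with a centered moving average.

    `profile` is a list of busy-processor counts, one per time step.
    The window is truncated at both ends of the profile.
    """
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def find_phases(smoothed, tolerance):
    """Split a smoothed profile into runs of roughly constant utilization.

    A new phase starts when a sample differs from the first sample of the
    current phase by more than `tolerance`. Returns (start, end) index
    pairs, with `end` exclusive.
    """
    phases = []
    start = 0
    for i in range(1, len(smoothed)):
        if abs(smoothed[i] - smoothed[start]) > tolerance:
            phases.append((start, i))
            start = i
    if smoothed:
        phases.append((start, len(smoothed)))
    return phases


# Hypothetical profile: 8 busy processors for 4 steps, then 2 for 4 steps.
profile = [8, 8, 8, 8, 2, 2, 2, 2]
phases = find_phases(moving_average(profile, 1), tolerance=1)
```

With a wider window the transition between the two levels is blurred over several samples, which is the trade-off any such smoother makes between noise suppression and sharp phase end points.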

The KSR1: Experimentation and modeling of poststore

Description: Kendall Square Research introduced the KSR1 system in 1991. The architecture is based on a ring of rings of 64-bit microprocessors. It is a distributed, shared memory system and is scalable. The memory structure is unique and is the key to understanding the system. Different levels of caching eliminate physical memory addressing and lead to the ALLCACHE™ scheme. Since requested data may be found in any of several caches, the initial access time is variable. Once pulled into the local (sub)cache, subsequent access times are fixed and minimal. Thus, the KSR1 is a Cache-Only Memory Architecture (COMA) system. This paper describes experimentation and an analytic model of the KSR1. The focus is on the poststore programmer option. With the poststore option, the programmer can elect to broadcast the updated value of a variable to all processors that might have a copy. This may save time for threads on other processors, but delays the broadcasting thread and places additional traffic on the ring. The specific issue addressed is to determine under what conditions poststore is beneficial. The analytic model and the experimental observations are in good agreement. They indicate that the decision to use poststore depends both on the application and the current system load.
Date: February 1, 1993
Creator: Rosti, E. (Milan Univ. (Italy). Dipt. di Scienze dell'Informazione); Smirni, E.; Wagner, T.D.; Apon, A.W. & Dowdy, L.W. (Vanderbilt Univ., Nashville, TN (United States). Dept. of Computer Science)
Partner: UNT Libraries Government Documents Department

The KSR1: Experimentation and modeling of poststore

Description: Kendall Square Research introduced the KSR1 system in 1991. The architecture is based on a ring of rings of 64-bit microprocessors. It is a distributed, shared memory system and is scalable. The memory structure is unique and is the key to understanding the system. Different levels of caching eliminate physical memory addressing and lead to the ALLCACHE™ scheme. Since requested data may be found in any of several caches, the initial access time is variable. Once pulled into the local (sub)cache, subsequent access times are fixed and minimal. Thus, the KSR1 is a Cache-Only Memory Architecture (COMA) system. This paper describes experimentation and an analytic model of the KSR1. The focus is on the poststore programmer option. With the poststore option, the programmer can elect to broadcast the updated value of a variable to all processors that might have a copy. This may save time for threads on other processors, but delays the broadcasting thread and places additional traffic on the ring. The specific issue addressed is to determine under what conditions poststore is beneficial. The analytic model and the experimental observations are in good agreement. They indicate that the decision to use poststore depends both on the application and the current system load.
Date: February 1, 1993
Creator: Rosti, E.; Smirni, E.; Wagner, T. D.; Apon, A. W. & Dowdy, L. W.
Partner: UNT Libraries Government Documents Department

Engineering Physics and Mathematics Division progress report for period ending December 31, 1992

Description: In this report, our research is described through abstracts of journal articles, technical reports, and presentations organized into sections following the five major operating units in the division: Mathematical Sciences, Intelligent Systems, Nuclear Data and Measurement Analysis, Nuclear Analysis and Shielding, and the Engineering Physics Information Centers. Each section begins with an introduction highlighting honors, awards, and significant research accomplishments in that unit during the reporting period.
Date: May 1, 1993
Creator: Ward, R.C.
Partner: UNT Libraries Government Documents Department

Engineering Physics and Mathematics Division progress report for period ending December 31, 1992

Description: In this report, our research is described through abstracts of journal articles, technical reports, and presentations organized into sections following the five major operating units in the division: Mathematical Sciences, Intelligent Systems, Nuclear Data and Measurement Analysis, Nuclear Analysis and Shielding, and the Engineering Physics Information Centers. Each section begins with an introduction highlighting honors, awards, and significant research accomplishments in that unit during the reporting period.
Date: May 1, 1993
Creator: Ward, R. C.
Partner: UNT Libraries Government Documents Department

Privacy and Security Research Group workshop on network and distributed system security: Proceedings

Description: This report contains papers on the following topics: NREN Security Issues: Policies and Technologies; Layer Wars: Protect the Internet with Network Layer Security; Electronic Commission Management; Workflow 2000 - Electronic Document Authorization in Practice; Security Issues of a UNIX PEM Implementation; Implementing Privacy Enhanced Mail on VMS; Distributed Public Key Certificate Management; Protecting the Integrity of Privacy-enhanced Electronic Mail; Practical Authorization in Large Heterogeneous Distributed Systems; Security Issues in the Truffles File System; Issues surrounding the use of Cryptographic Algorithms and Smart Card Applications; Smart Card Augmentation of Kerberos; and An Overview of the Advanced Smart Card Access Control System. Selected papers were processed separately for inclusion in the Energy Science and Technology Database.
Date: May 1, 1993
Partner: UNT Libraries Government Documents Department

Theory, modeling, and simulation annual report, 1992

Description: This report briefly discusses research on the following topics: development of electronic structure methods; modeling molecular processes in clusters; modeling molecular processes in solution; modeling molecular processes in separations chemistry; modeling interfacial molecular processes; modeling molecular processes in the atmosphere; methods for periodic calculations on solids; chemistry and physics of minerals; graphical user interfaces for computational chemistry codes; visualization and analysis of molecular simulations; integrated computational chemistry environment; and benchmark computations.
Date: May 1, 1993
Partner: UNT Libraries Government Documents Department

Analyzing PICL trace data with MEDEA

Description: Execution traces and performance statistics can be collected for parallel applications on a variety of multiprocessor platforms by using the Portable Instrumented Communication Library (PICL). The static and dynamic performance characteristics of performance data can be analyzed easily and effectively with the facilities provided within the MEasurements Description Evaluation and Analysis tool (MEDEA). This report describes the integration of the PICL trace file format into MEDEA. A case study is then outlined that uses PICL and MEDEA to characterize the performance of a parallel benchmark code executed on different hardware platforms and using different parallel algorithms and communication protocols.
Date: November 1, 1993
Creator: Merlo, A. P. & Worley, P. H.
Partner: UNT Libraries Government Documents Department

Parallel algorithms for the spectral transform method

Description: The spectral transform method is a standard numerical technique for solving partial differential equations on a sphere and is widely used in atmospheric circulation models. Recent research has identified several promising algorithms for implementing this method on massively parallel computers; however, no detailed comparison of the different algorithms has previously been attempted. In this paper, we describe these different parallel algorithms and report on computational experiments that we have conducted to evaluate their efficiency on parallel computers. The experiments used a testbed code that solves the nonlinear shallow water equations on a sphere; considerable care was taken to ensure that the experiments provide a fair comparison of the different algorithms and that the results are relevant to global models. We focus on hypercube- and mesh-connected multicomputers with cut-through routing, such as the Intel iPSC/860, DELTA, and Paragon, and the nCUBE/2, but also indicate how the results extend to other parallel computer architectures. The results of this study are relevant not only to the spectral transform method but also to multidimensional FFTs and other parallel transforms.
Date: April 1, 1994
Creator: Foster, I. T. & Worley, P. H.
Partner: UNT Libraries Government Documents Department

Paradigms and strategies for scientific computing on distributed memory concurrent computers

Description: In this work we examine recent advances in parallel languages and abstractions that have the potential for improving the programmability and maintainability of large-scale, parallel, scientific applications running on high performance architectures and networks. This paper focuses on Fortran M, a set of extensions to Fortran 77 that supports the modular design of message-passing programs. We describe the Fortran M implementation of a particle-in-cell (PIC) plasma simulation application, and discuss issues in the optimization of the code. The use of two other methodologies for parallelizing the PIC application is considered. The first is based on the shared object abstraction as embodied in the Orca language. The second approach is the Split-C language. In Fortran M, Orca, and Split-C the ability of the programmer to control the granularity of communication is important in designing an efficient implementation.
Date: June 1, 1994
Creator: Foster, I. T. & Walker, D. W.
Partner: UNT Libraries Government Documents Department

A note on the total least squares problem for coplanar points

Description: The Total Least Squares (TLS) fit to the points $(x_k, y_k)$, $k = 1, \ldots, n$, minimizes the sum of the squares of the perpendicular distances from the points to the line. This sum is the TLS error, and minimizing its magnitude is appropriate if $x_k$ and $y_k$ are uncertain. A priori formulas for the TLS fit and TLS error to coplanar points were originally derived by Pearson, and they are expressed in terms of the mean, standard deviation and correlation coefficient of the data. In this note, these TLS formulas are derived in a more elementary fashion. The TLS fit is obtained via the ordinary least squares problem and the algebraic properties of complex numbers. The TLS error is formulated in terms of the triangle inequality for complex numbers.
Date: September 1, 1994
Creator: Lee, S. L.
Partner: UNT Libraries Government Documents Department
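
As an illustration of the fit described in the abstract above, here is a minimal sketch of one standard closed form for the TLS line through planar points, written with complex arithmetic. Whether it matches the note's own derivation step for step is not confirmed by the abstract; the function name `tls_line` and the sample points are hypothetical. With centered points $z_k$, the line direction is half the argument of $\sum z_k^2$, and the TLS error is $\tfrac{1}{2}\left(\sum |z_k|^2 - \left|\sum z_k^2\right|\right)$, which is non-negative by the triangle inequality.

```python
import cmath
import math


def tls_line(points):
    """Total least squares line fit to a list of (x, y) pairs.

    Returns the centroid the line passes through, the direction angle
    theta of the line (radians), and the TLS error, i.e. the sum of
    squared perpendicular distances from the points to the line.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Centered points written as complex numbers z_k.
    z = [complex(x - cx, y - cy) for x, y in points]
    s = sum(w * w for w in z)            # sum of z_k squared (complex)
    total = sum(abs(w) ** 2 for w in z)  # sum of |z_k| squared (real)
    theta = 0.5 * cmath.phase(s)
    error = max(0.5 * (total - abs(s)), 0.0)  # clamp rounding noise
    return (cx, cy), theta, error


# Hypothetical data lying exactly on y = 2x + 1, so the error is zero.
center, theta, error = tls_line([(0, 1), (1, 3), (2, 5)])
slope = math.tan(theta)
```

If $\sum z_k^2 = 0$ the direction is not unique (every line through the centroid gives the same error), and the sketch simply returns an angle of zero in that case.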

DOLIB: Distributed Object Library

Description: This report describes the use and implementation of DOLIB (Distributed Object Library), a library of routines that emulates global or virtual shared memory on Intel multiprocessor systems. Access to a distributed global array is through explicit calls to gather and scatter. Advantages of using DOLIB include: dynamic allocation and freeing of huge (gigabyte) distributed arrays, both C and FORTRAN callable interfaces, and the ability to mix shared-memory and message-passing programming models for ease of use and optimal performance. DOLIB is independent of language and compiler extensions and requires no special operating system support. DOLIB also supports automatic caching of read-only data for high performance. The virtual shared memory support provided in DOLIB is well suited for implementing Lagrangian particle tracking techniques. We have also used DOLIB to create DONIO (Distributed Object Network I/O Library), which obtains over a 10-fold improvement in disk I/O performance on the Intel Paragon.
Date: October 1, 1994
Creator: D'Azevedo, E. F. & Romine, C. H.
Partner: UNT Libraries Government Documents Department

Bounds for departure from normality and the Frobenius norm of matrix eigenvalues

Description: New lower and upper bounds for the departure from normality and the Frobenius norm of the eigenvalues of a matrix are given. The significant properties of these bounds are also described. For example, the upper bound for matrix eigenvalues improves upon the one derived by Kress, de Vries and Wegmann in [Lin. Alg. Appl., 8 (1974), pp. 109-120]. The upper bound for departure from normality is sharp for any matrix whose eigenvalues are collinear in the complex plane. Moreover, the latter formula is a practical estimate that costs (at most) 2m multiplications, where m is the number of nonzeros in the matrix. In terms of applications, the results can be used to bound from above the sensitivity of eigenvalues to matrix perturbations or bound from below the condition number of the eigenbasis of a matrix.
Date: December 1, 1994
Creator: Lee, S. L.
Partner: UNT Libraries Government Documents Department
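
The abstract above does not state the definition it uses or give its bounds, so the sketch below only computes the quantity being bounded, assuming Henrici's Frobenius-norm definition of departure from normality, $\sqrt{\|A\|_F^2 - \sum_i |\lambda_i|^2}$. It is restricted to 2×2 matrices so the eigenvalues can be obtained from the trace and determinant; the function name `departure_from_normality_2x2` is hypothetical, and this is not the 2m-multiplication estimate the abstract mentions.

```python
import cmath
import math


def departure_from_normality_2x2(a, b, c, d):
    """Departure from normality of the matrix [[a, b], [c, d]].

    Assumes Henrici's definition: sqrt(||A||_F^2 - sum |lambda_i|^2).
    Eigenvalues come from the characteristic polynomial
    lambda^2 - trace * lambda + det = 0.
    """
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    lam1 = (trace + root) / 2
    lam2 = (trace - root) / 2
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    eig_sq = abs(lam1) ** 2 + abs(lam2) ** 2
    # Clamp at zero to absorb rounding error for normal matrices.
    return math.sqrt(max(frob_sq - eig_sq, 0.0))


# A Jordan block is non-normal; a rotation matrix is normal.
jordan = departure_from_normality_2x2(1, 1, 0, 1)
rotation = departure_from_normality_2x2(0, 1, -1, 0)
```

The value is zero exactly when the matrix is normal, which is why upper bounds on it are useful as cheap indicators of eigenvalue sensitivity.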

A new shared-memory programming paradigm for molecular dynamics simulations on the Intel Paragon

Description: This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing.
Date: December 1, 1994
Creator: D'Azevedo, E.F. & Romine, C.H.
Partner: UNT Libraries Government Documents Department