Search Results

open access

Using Perturbed QR Factorizations To Solve Linear Least-Squares Problems

Description: We propose and analyze a new tool to help solve sparse linear least-squares problems min_x ||Ax-b||_2. Our method is based on a sparse QR factorization of a low-rank perturbation Â of A. More precisely, we show that the R factor of Â is an effective preconditioner for the least-squares problem min_x ||Ax-b||_2, when solved using LSQR. We propose applications for the new technique. When A is rank deficient we can add rows to ensur… more
Date: March 21, 2008
Creator: Avron, Haim; Ng, Esmond G. & Toledo, Sivan
Partner: UNT Libraries Government Documents Department
open access

A Supernodal Approach to Incomplete LU Factorization with Partial Pivoting

Description: We present a new supernode-based incomplete LU factorization method to construct a preconditioner for solving sparse linear systems with iterative methods. The new algorithm is primarily based on the ILUTP approach by Saad, and we incorporate a number of techniques to improve the robustness and performance of the traditional ILUTP method. These include the new dropping strategies that accommodate the use of supernodal structures in the factored matrix. We present numerical experiments to demons… more
Date: June 25, 2009
Creator: Li, Xiaoye Sherry & Shao, Meiyue
Partner: UNT Libraries Government Documents Department
open access

A new scheduling algorithm for parallel sparse LU factorization with static pivoting

Description: In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU_DIST are reported after applying this algorithm on real world applic… more
Date: August 20, 2002
Creator: Grigori, Laura & Li, Xiaoye S.
Partner: UNT Libraries Government Documents Department
open access

LUsim: A Framework for Simulation-Based Performance Modeling and Prediction of Parallel Sparse LU Factorization

Description: Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends both on system characteristics such as the floating point rate, the memory hierarchy, and the interconnect performance, as well as input matrix characteristics such as the number and location of nonzeros. We present LUsim, a simulation framework for modeling the performance of sparse LU factorization. Our framework uses micro-benchmarks to calibrate the param… more
Date: April 15, 2008
Creator: Univ. of California, San Diego
Partner: UNT Libraries Government Documents Department
open access

Performance analysis of parallel supernodal sparse LU factorization

Description: We investigate performance characteristics for the LU factorization of large matrices with various sparsity patterns. We consider supernodal right-looking parallel factorization on a bi-dimensional grid of processors, making use of static pivoting. We develop a performance model and we validate it using the implementation in SuperLU-DIST, the real matrices and the IBM Power3 machine at NERSC. We use this model to obtain performance bounds on parallel computers, to perform scalability analysis a… more
Date: February 5, 2004
Creator: Grigori, Laura & Li, Xiaoye S.
Partner: UNT Libraries Government Documents Department
open access

Towards an Accurate Performance Modeling of Parallel Sparse Factorization

Description: We present a performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. Our model characterizes the algorithmic behavior by taking into account the underlying processor speed, memory system performance, as well as the interconnect speed. The model is validated using the SuperLU_DIST linear system solver, the sparse matrices from real applications, and an IBM POWER3 parallel machine. Our modeling methodology can be easily adapted to stud… more
Date: May 26, 2006
Creator: Grigori, Laura & Li, Xiaoye S.
Partner: UNT Libraries Government Documents Department
open access

Test of weak and strong factorization in nucleus-nucleus collisions at several hundred MeV/nucleon

Description: Total and partial charge-changing cross sections have been measured for argon projectiles at 400 MeV/nucleon in carbon, aluminum, copper, tin and lead targets; cross sections for hydrogen were also obtained, using a polyethylene target. The validity of weak and strong factorization properties has been investigated for partial charge-changing cross sections; preliminary cross section values obtained for carbon, neon and silicon at 290 and 400 MeV/nucleon and iron at 400 MeV/nucleon, in carbon, a… more
Date: June 21, 2006
Creator: La Tessa, Chiara; Sihver, Lembit; Zeitlin, Cary; Miller, Jack; Guetersloh, Stephen; Heilbronn, Lawrence et al.
Partner: UNT Libraries Government Documents Department
open access

Theoretical Studies in Elementary Particle Physics

Description: This final report summarizes work at Penn State University from June 1, 1990 to April 30, 2012. The work was in theoretical elementary particle physics. Many new results in perturbative QCD, in string theory, and in related areas were obtained, with a substantial impact on the experimental program.
Date: April 1, 2013
Creator: Collins, John C. & Roiban, Radu S
Partner: UNT Libraries Government Documents Department
open access

On the Equivalence of Nonnegative Matrix Factorization and K-means- Spectral Clustering

Description: We provide a systematic analysis of nonnegative matrix factorization (NMF) relating to data clustering. We generalize the usual X = FG^T decomposition to the symmetric W = HH^T and W = HSH^T decompositions. We show that (1) W = HH^T is equivalent to Kernel K-means clustering and the Laplacian-based spectral clustering. (2) X = FG^T is equivalent to simultaneous clustering of rows and columns of a bipartite graph. We emphasize the importance of orthogonality in NMF and … more
Date: December 4, 2005
Creator: Ding, Chris; He, Xiaofeng; Simon, Horst D. & Jin, Rong
Partner: UNT Libraries Government Documents Department
open access

Unvail the Mysterious of the Single Spin Asymmetry

Description: Single transverse-spin asymmetry in high energy hadronic reactions has been extensively investigated from both the experimental and theoretical sides in the last few years. In this talk, I will summarize some recent theoretical developments, which, in my opinion, help to unveil the mystery of the single spin asymmetry.
Date: January 5, 2010
Creator: Yuan, Feng
Partner: UNT Libraries Government Documents Department
open access

Unsymmetric ordering using a constrained Markowitz scheme

Description: We present a family of ordering algorithms that can be used as a preprocessing step prior to performing sparse LU factorization. The ordering algorithms simultaneously achieve the objectives of selecting numerically good pivots and preserving the sparsity. We describe the algorithmic properties and challenges in their implementation. By mixing the two objectives we show that we can reduce the amount of fill-in in the factors and reduce the number of numerical problems during factorization. On a… more
Date: January 18, 2005
Creator: Amestoy, Patrick R.; Li, Xiaoye S. & Pralet, Stephane
Partner: UNT Libraries Government Documents Department
open access

Multithreading for Synchronization Tolerance in Matrix Factorization

Description: Physical constraints such as power, leakage and pin bandwidth are currently driving the HPC industry to produce systems with unprecedented levels of concurrency. In these parallel systems, synchronization and memory operations are becoming considerably more expensive than before. In this work we study parallel matrix factorization codes and conclude that they need to be re-engineered to avoid unnecessary (and expensive) synchronization. We propose the use of multithreading combined with intelli… more
Date: July 16, 2007
Creator: Buttari, Alfredo; Dongarra, Jack; Husbands, Parry; Kurzak, Jakub & Yelick, Katherine
Partner: UNT Libraries Government Documents Department
open access

Enhancing Scalability of Sparse Direct Methods

Description: TOPS is providing high-performance, scalable sparse direct solvers, which have had significant impacts on the SciDAC applications, including fusion simulation (CEMM), accelerator modeling (COMPASS), as well as many other mission-critical applications in DOE and elsewhere. Our recent developments have focused on new techniques to overcome the scalability bottleneck of direct methods, in both time and memory. These include parallelizing the symbolic analysis phase and developing linear-complexity s… more
Date: July 23, 2007
Creator: Li, Xiaoye S.; Demmel, James; Grigori, Laura; Gu, Ming; Xia, Jianlin; Jardin, Steve et al.
Partner: UNT Libraries Government Documents Department
open access

Updating the Symmetric Indefinite Factorization with Applications in a Modified Newton's Method

Description: In recent years the use of quasi-Newton methods in optimization algorithms has inspired much of the research in an area of numerical linear algebra called updating matrix factorizations. Previous research in this area has been concerned with updating the factorization of a symmetric positive definite matrix. Here, a numerical algorithm is presented for updating the Symmetric Indefinite Factorization of Bunch and Parlett. The algorithm requires only O(n^2) arithmetic operations to update th… more
Date: June 1977
Creator: Sorensen, Danny C.
Partner: UNT Libraries Government Documents Department
open access

NIKE3D a nonlinear, implicit, three-dimensional finite element code for solid and structural mechanics user's manual update summary

Description: This report provides the NIKE3D user's manual update summary for changes made from version 3.0.0 (April 24, 1995) to version 3.3.6 (March 24, 2000). The updates are excerpted directly from the code printed output file (hence the Courier font and formatting), are presented in chronological order and delineated by NIKE3D version number. NIKE3D is a fully implicit three-dimensional finite element code for analyzing the finite strain static and dynamic response of inelastic solids, shells, and beams. Sp… more
Date: March 24, 2000
Creator: Puso, M; Maker, B N; Ferencz, R M & Hallquist, J O
Partner: UNT Libraries Government Documents Department
open access

Direction-preserving and Schur-monotonic Semi-separable Approximations of Symmetric Positive Definite Matrices

Description: For a given symmetric positive definite matrix A ∈ R^(n×n), we develop a fast and backward stable algorithm to approximate A by a symmetric positive-definite semi-separable matrix, accurate to any prescribed tolerance. In addition, this algorithm preserves the product, AZ, for a given matrix Z ∈ R^(n×d), where d << n. Our algorithm guarantees the positive-definiteness of the semi-separable matrix by embedding an approximation strategy inside a Cholesky factoriz… more
Date: October 20, 2009
Creator: Gu, Ming; Li, Xiaoye Sherry & Vassilevski, Panayot S.
Partner: UNT Libraries Government Documents Department
open access

A block orthogonalization procedure with constant synchronization requirements

Description: We propose an alternative orthonormalization method that computes the orthonormal basis from the right singular vectors of a matrix. Its advantages are: (a) all operations are matrix-matrix multiplications and thus cache-efficient, (b) only one synchronization point is required in parallel implementations, (c) it could be more stable than Gram-Schmidt. In addition, we consider the problem of incremental orthonormalization where a block of vectors is orthonormalized against a previously orthonormal … more
Date: April 17, 2000
Creator: Stathopoulos, Andreas & Wu, Kesheng
Partner: UNT Libraries Government Documents Department
open access

Rate Equation Theory for Island Sizes and Capture Zone Areas in Submonolayer Deposition: Realistic Treatment of Spatial Aspects of Nucleation

Description: Extensive information on the distribution of islands formed during submonolayer deposition is provided by the joint probability distribution (JPD) for island sizes, s, and capture zone areas, A. A key ingredient determining the form of the JPD is the impact of each nucleation event on existing capture zone areas. Combining a realistic characterization of such spatial aspects of nucleation with a factorization ansatz for the JPD, we provide a concise rate equation formulation for the variation w… more
Date: December 5, 2002
Creator: Evans, J. W.; Li, M. & Bartelt, M. C.
Partner: UNT Libraries Government Documents Department
open access

Benefits of IEEE-754 features in modern symmetric tridiagonal eigensolvers

Description: Bisection is one of the most common methods used to compute the eigenvalues of symmetric tridiagonal matrices. Bisection relies on the Sturm count: For a given shift σ, the number of negative pivots in the factorization T - σI = LDL^T equals the number of eigenvalues of T that are smaller than σ. In IEEE-754 arithmetic, the value ∞ permits the computation to continue past a zero pivot, producing a correct Sturm count when T is unreduced. Demmel and Li showed that using ∞ rather tha… more
Date: March 12, 2006
Creator: Marques, Osni; Riedy, Jason E. & Vomel, Christof
Partner: UNT Libraries Government Documents Department
open access

LDRD final report : leveraging multi-way linkages on heterogeneous data.

Description: This report is a summary of the accomplishments of the 'Leveraging Multi-way Linkages on Heterogeneous Data' project, which ran from FY08 through FY10. The goal was to investigate scalable and robust methods for multi-way data analysis. We developed a new optimization-based method called CPOPT for fitting a particular type of tensor factorization to data; CPOPT was compared against existing methods and found to be more accurate than any faster method and faster than any equally accurate method. We exten… more
Date: September 1, 2010
Creator: Dunlavy, Daniel M. & Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)
Partner: UNT Libraries Government Documents Department
open access

Source Apportionment Analysis of Measured Volatile Organic Compounds in Corpus Christi, Texas

Description: Corpus Christi is among the largest industrialized coastal urban areas in Texas. The strategic location of the city along the Gulf of Mexico allows many important industries and international businesses to be located there. The cluster of industries and businesses in the region contributes to air pollution through emissions that are harmful to the environment and to the people living in and visiting the area. Volatile organic compounds (VOC) constitute an important class of pollutants measured i… more
Date: May 2014
Creator: Abood, Ahmed T.
Partner: UNT Libraries
open access

Factoring Algebraic Error for Relative Pose Estimation

Description: We address the problem of estimating the relative pose, i.e. translation and rotation, of two calibrated cameras from image point correspondences. Our approach is to factor the nonlinear algebraic pose error functional into translational and rotational components, and to optimize translation and rotation independently. This factorization admits subproblems that can be solved using direct methods with practical guarantees on global optimality. That is, for a given translation, the corresponding … more
Date: March 9, 2009
Creator: Lindstrom, P & Duchaineau, M
Partner: UNT Libraries Government Documents Department
open access

Making tensor factorizations robust to non-gaussian noise.

Description: Tensors are multi-way arrays, and the CANDECOMP/PARAFAC (CP) tensor factorization has found application in many different domains. The CP model is typically fit using a least squares objective function, which is a maximum likelihood estimate under the assumption of independent and identically distributed (i.i.d.) Gaussian noise. We demonstrate that this loss function can be highly sensitive to non-Gaussian noise. Therefore, we propose a loss function based on the 1-norm because it can accommoda… more
Date: March 1, 2011
Creator: Chi, Eric C. (Rice University, Houston, TX) & Kolda, Tamara Gibson
Partner: UNT Libraries Government Documents Department