Computer Science – Numerical Analysis
Scientific paper
2012-03-08
We present an optimized single-precision implementation of the Sparse Approximate Matrix Multiply (\SpAMM{}) [M. Challacombe and N. Bock, arXiv {\bf 1011.3534} (2010)], a fast algorithm for matrix-matrix multiplication of matrices with decay that achieves $\mathcal{O} (n \ln n)$ computational complexity with respect to the matrix dimension $n$. For dense quantum chemical matrices, we find that the max norm of the error achieved with a \SpAMM{} tolerance below $2 \times 10^{-8}$ is lower than that of the single-precision {\tt SGEMM}, and that \SpAMM{} outperforms {\tt SGEMM} with a cross-over already at small matrix sizes ($n \sim 1000$). Our optimized version is found to be significantly faster than naive implementations of \SpAMM{} that use Intel's Math Kernel Library ({\tt MKL}) or AMD's Core Math Library ({\tt ACML}). Detailed performance comparisons are made for quantum chemical matrices with differently structured sub-blocks. Finally, we discuss the potential of improved hardware prefetch to yield 2--3x speedups.
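To make the idea behind the abstract concrete, the following is a minimal, unoptimized sketch of a norm-thresholded recursive multiply in the spirit of \SpAMM{}. It is not the optimized implementation described above. It assumes square matrices whose dimension is a power of two, a skip criterion based on the product of Frobenius norms of the two sub-blocks, and norms recomputed on the fly rather than stored in a quadtree. The names `spamm_multiply`, `_spamm`, `frobenius`, `tau`, and `leaf` are hypothetical and chosen only for illustration.

```python
import math


def frobenius(M, r0, c0, size):
    """Frobenius norm of the size x size block of M starting at (r0, c0)."""
    return math.sqrt(sum(M[r0 + i][c0 + j] ** 2
                         for i in range(size) for j in range(size)))


def _spamm(A, B, C, i0, k0, j0, size, tau, leaf):
    """Accumulate A[i0.., k0..] * B[k0.., j0..] into C[i0.., j0..].

    Assumption: a block product is skipped when the product of the two
    block norms falls below the tolerance tau.
    """
    # A real implementation would cache these norms in a quadtree;
    # recomputing them here keeps the sketch short.
    if frobenius(A, i0, k0, size) * frobenius(B, k0, j0, size) < tau:
        return
    if size <= leaf:
        # Dense multiply on a leaf block.
        for i in range(size):
            for k in range(size):
                a = A[i0 + i][k0 + k]
                for j in range(size):
                    C[i0 + i][j0 + j] += a * B[k0 + k][j0 + j]
        return
    h = size // 2
    # Recurse over the eight sub-block products of the 2 x 2 partition.
    for di in (0, h):
        for dj in (0, h):
            for dk in (0, h):
                _spamm(A, B, C, i0 + di, k0 + dk, j0 + dj, h, tau, leaf)


def spamm_multiply(A, B, tau, leaf=2):
    """Approximate product of two n x n matrices (n a power of two)."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    _spamm(A, B, C, 0, 0, 0, n, tau, leaf)
    return C
```

With `tau = 0.0` no block is ever skipped and the sketch reduces to an ordinary recursive multiply. For matrices with decay, a small positive `tau` prunes the products of far-off-diagonal blocks, which is where the reduced cost would come from.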
Bock Nicolas
Challacombe Matt
An Optimized Sparse Approximate Matrix Multiply