An Optimized Sparse Approximate Matrix Multiply

Computer Science – Numerical Analysis

Scientific paper

We present an optimized single-precision implementation of the Sparse Approximate Matrix Multiply (SpAMM) [M. Challacombe and N. Bock, arXiv:1011.3534 (2010)], a fast algorithm for multiplying matrices with decay that achieves $\mathcal{O}(n \ln n)$ computational complexity with respect to the matrix dimension $n$. For dense quantum chemical matrices, we find that the max norm of the error achieved with a SpAMM tolerance below $2 \times 10^{-8}$ is lower than that of the single-precision SGEMM, and that SpAMM outperforms SGEMM with a cross-over already at small matrix sizes ($n \sim 1000$). Our optimized version is significantly faster than naive implementations of SpAMM built on Intel's Math Kernel Library (MKL) or AMD's Core Math Library (ACML). Detailed performance comparisons are made for quantum chemical matrices with differently structured sub-blocks. Finally, we discuss the potential of improved hardware prefetch to yield 2-3x speedups.
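
The abstract does not spell out the recursion, so the following is only a minimal sketch of the general SpAMM idea from the cited reference: split both matrices into quadrants recursively and skip any sub-product whose norm product falls below the tolerance. It is not the optimized single-precision implementation described above. The names `spamm`, `decay_matrix`, `tau`, and `leaf` are hypothetical, the matrices are assumed square with a power-of-two dimension, and the norms are recomputed at every call rather than stored in a quadtree, which a real implementation would presumably avoid.

```python
import numpy as np


def spamm(a, b, tau, leaf=16):
    """Approximate a @ b, skipping sub-products with small norm products.

    Assumes a and b are square with the same power-of-two dimension.
    """
    n = a.shape[0]
    # The Frobenius norm is submultiplicative, so ||a|| * ||b|| bounds
    # ||a @ b||. If that bound is below tau, drop the whole product.
    if np.linalg.norm(a) * np.linalg.norm(b) < tau:
        return np.zeros((n, n), dtype=a.dtype)
    # Small blocks are multiplied densely.
    if n <= leaf:
        return a @ b
    h = n // 2
    c = np.zeros((n, n), dtype=a.dtype)
    # Quadrant recursion: C_ij = sum over k of A_ik @ B_kj.
    for i in (0, h):
        for j in (0, h):
            for k in (0, h):
                c[i:i + h, j:j + h] += spamm(
                    a[i:i + h, k:k + h], b[k:k + h, j:j + h], tau, leaf
                )
    return c


def decay_matrix(n, rate=1.0):
    """Hypothetical test matrix with exponential decay away from the diagonal."""
    idx = np.arange(n)
    return np.exp(-rate * np.abs(idx[:, None] - idx[None, :]))


if __name__ == "__main__":
    a = decay_matrix(64)
    approx = spamm(a, a, tau=1e-6)
    # Max norm of the error relative to the dense product.
    print(np.max(np.abs(approx - a @ a)))
```

With `tau=0.0` nothing is culled and the result matches the dense product up to rounding. Larger values of `tau` trade accuracy for skipped work, which is where the savings for matrices with decay would come from.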
