Scientific paper
2010-11-04
Computer Science
Distributed, Parallel, and Cluster Computing
7 pages
In this note we briefly describe our Cholesky modification algorithm for streaming multiprocessor architectures. Our implementation is available in C++ with a Matlab binding and uses CUDA to utilise the graphics processing unit (GPU). Because the problem is bandwidth bound, only limited speed-ups are possible. Furthermore, a complex dependency pattern must be obeyed, which requires multiple kernels to be launched. Nonetheless, this makes for an interesting problem, and our approach can reduce the computation time by a factor of around 7 for matrices of size 5000 by 5000 and k=16, in comparison with the LINPACK suite running on a CPU of comparable vintage. Much larger problems can be handled, however, because the GPU memory required by our method scales as O(n).
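For readers unfamiliar with Cholesky modification, the following is a minimal sequential sketch of the textbook rank-1 update/downdate, applied k times to obtain a rank-k modification. It is not the GPU implementation described above; the function names `chol_rank1` and `chol_rankk`, the row-major storage of L, and the column-major storage of X are assumptions made purely for illustration. Given a lower-triangular factor L with A = L Lᵀ, it overwrites L with the factor of A ± X Xᵀ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: rank-1 update (sign = +1) or downdate (sign = -1)
// of a lower-triangular Cholesky factor L (row-major, n*n entries), so that
// on return L*L^T equals the old L*L^T + sign * x*x^T.
// x is used as workspace and is overwritten.
// Returns false if a downdate would destroy positive definiteness.
bool chol_rank1(std::vector<double>& L, std::vector<double>& x,
                std::size_t n, double sign) {
    assert(L.size() == n * n && x.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        const double lkk = L[k * n + k];
        const double r2 = lkk * lkk + sign * x[k] * x[k];
        if (r2 <= 0.0) return false;      // result would not be positive definite
        const double r = std::sqrt(r2);
        const double c = r / lkk;
        const double s = x[k] / lkk;
        L[k * n + k] = r;
        // The entries of column k below the diagonal are independent of one
        // another; step k + 1 needs the x produced by step k.
        for (std::size_t i = k + 1; i < n; ++i) {
            L[i * n + k] = (L[i * n + k] + sign * s * x[i]) / c;
            x[i] = c * x[i] - s * L[i * n + k];
        }
    }
    return true;
}

// Hypothetical helper: rank-k modification as k successive rank-1 steps.
// X holds k vectors of length n, stored one after another (column-major).
bool chol_rankk(std::vector<double>& L, const std::vector<double>& X,
                std::size_t n, std::size_t k, double sign) {
    assert(X.size() == n * k);
    for (std::size_t j = 0; j < k; ++j) {
        std::vector<double> x(X.begin() + j * n, X.begin() + (j + 1) * n);
        if (!chol_rank1(L, x, n, sign)) return false;
    }
    return true;
}
```

In this sketch each sweep touches every entry of the lower triangle once and does only a few floating-point operations per entry, which is consistent with the bandwidth-bound behaviour noted above. The loop over k carries a dependency through x, whereas the inner loop over i does not.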
Rank k Cholesky Up/Down-dating on the GPU: gpucholmodV0.2