Fully Empirical Autotuned QR Factorization For Multicore Architectures

Computer Science – Distributed, Parallel, and Cluster Computing

Scientific paper

Details

Tuning numerical libraries has become more difficult over time as systems have grown more sophisticated. In particular, modern multicore machines make the behaviour of algorithms hard to forecast and model. In this paper, we tackle the issue of tuning a dense QR factorization on multicore architectures. We show that it is hard to rely on a model, which motivates us to design a fully empirical approach. We exhibit a few strong empirical properties that enable us to prune the search space efficiently. Our method is automatic, fast and reliable. The tuning process is performed entirely at install time, in less than one and ten minutes on five out of seven platforms. Depending on the platform, we achieve an average of 97% to 100% of the optimum performance. This work is a basis for autotuning the PLASMA library and enabling easy performance portability across hardware systems.
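
As a rough illustration of what "fully empirical" tuning means in general, the sketch below times every candidate parameter value and keeps the fastest one, with no performance model involved. It is not the method of the paper: the function name `autotune`, the `benchmark` callback, and the candidate tile sizes are hypothetical, and the pruning properties mentioned in the abstract are not described on this page, so none are modelled here.

```python
from typing import Callable, Iterable, Optional, Tuple


def autotune(
    candidates: Iterable[int],
    benchmark: Callable[[int], float],
    repeats: int = 3,
) -> Tuple[Optional[int], float]:
    """Pick the candidate with the lowest measured cost.

    `benchmark(c)` is assumed to run the kernel once with parameter `c`
    and return its elapsed time in seconds (hypothetical callback).
    """
    best: Optional[int] = None
    best_time = float("inf")
    for cand in candidates:
        # Take the minimum over a few runs to reduce timing noise.
        elapsed = min(benchmark(cand) for _ in range(repeats))
        if elapsed < best_time:
            best, best_time = cand, elapsed
    return best, best_time


def synthetic_benchmark(tile_size: int) -> float:
    """Stand-in for a real timing run; its minimum is at tile size 128."""
    return abs(tile_size - 128) / 1000.0 + 0.01


if __name__ == "__main__":
    # Hypothetical tile sizes; a real search space would be platform-specific.
    choice, cost = autotune([64, 96, 128, 160, 200], synthetic_benchmark)
    print(choice, cost)
```

In a real setting, `benchmark` would wrap an actual timed run of the factorization kernel, and the candidate list would be reduced beforehand by whatever pruning rules the tuner relies on.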
