An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm

Physics – Plasma Physics

Scientific paper

Details

25 pages, 6 figures, submitted to J. Comput. Phys.

Recently, a fully implicit, energy- and charge-conserving particle-in-cell (PIC) method has been proposed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230,18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and can use very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is that the particle-orbit computations are segregated from the field solver while the scheme remains fully self-consistent. This paper describes an efficient, mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm that exploits this feature. The JFNK solver is kept on the CPU in double precision (DP), while the implicit, charge-conserving, adaptive particle mover is implemented on a graphics processing unit (GPU) using CUDA in single precision (SP). Performance-oriented optimizations are introduced with the aid of the roofline model. The implicit particle mover is shown to achieve up to 400 GOp/s on an Nvidia GeForce GTX580. This corresponds to 25% absolute GPU efficiency relative to the theoretical peak performance, and is about 300 times faster than an equivalent serial execution on a CPU (Intel Xeon X5460). For the test case chosen, the mixed-precision hybrid CPU-GPU solver is shown to outperform the DP CPU-only serial version by a factor of roughly 100, without apparent loss of robustness or accuracy in a challenging long-timescale ion acoustic wave simulation.
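
The roofline model mentioned in the abstract bounds a kernel's attainable throughput by the lesser of the compute peak and the product of memory bandwidth and arithmetic intensity. The following is a minimal sketch of that bound and of the efficiency arithmetic, not the authors' analysis. The peak throughput (1581 GOp/s in single precision) and memory bandwidth (192.4 GB/s) are nominal GTX 580 figures assumed here for illustration; neither appears in the abstract. The function names are hypothetical.

```python
def attainable_gops(peak_gops, bandwidth_gb_s, ops_per_byte):
    """Roofline bound: the lesser of the compute roof and the memory roof."""
    return min(peak_gops, bandwidth_gb_s * ops_per_byte)


def ridge_point(peak_gops, bandwidth_gb_s):
    """Arithmetic intensity (ops/byte) above which a kernel is compute-bound."""
    return peak_gops / bandwidth_gb_s


# Assumed nominal hardware figures (not stated in the abstract).
PEAK_SP_GOPS = 1581.0    # single-precision peak, GOp/s
BANDWIDTH_GB_S = 192.4   # device memory bandwidth, GB/s

# Measured throughput quoted in the abstract.
MEASURED_GOPS = 400.0

# Fraction of the assumed peak that the measured throughput represents.
efficiency = MEASURED_GOPS / PEAK_SP_GOPS

# Intensity a kernel would need to reach the compute roof on this hardware.
ridge = ridge_point(PEAK_SP_GOPS, BANDWIDTH_GB_S)
```

Under these assumed figures, the quoted 400 GOp/s works out to roughly a quarter of peak, in line with the 25% efficiency stated above. A kernel whose arithmetic intensity is below `ridge` is limited by memory bandwidth and not by compute.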
