Physics – Plasma Physics
Scientific paper
2011-11-22
25 pages, 6 figures, submitted to J. Comput. Phys.
Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been proposed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230,18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver, capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent. This paper describes a very efficient, mixed-precision hybrid CPU-GPU implementation of the implicit PIC algorithm that exploits this feature. The JFNK solver is kept on the CPU in double precision (DP), while the implicit, charge-conserving, and adaptive particle mover is implemented on a GPU (graphics processing unit) using CUDA in single precision (SP). Performance-oriented optimizations are introduced with the aid of the roofline model. The implicit particle mover algorithm is shown to achieve up to 400 GOp/s on an Nvidia GeForce GTX580. This corresponds to 25% absolute GPU efficiency against the peak theoretical performance, and is about 300 times faster than an equivalent serial CPU (Intel Xeon X5460) execution. For the test case chosen, the mixed-precision hybrid CPU-GPU solver is shown to outperform the DP CPU-only serial version by a factor of about 100, without apparent loss of robustness or accuracy in a challenging long-timescale ion acoustic wave simulation.
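The roofline reasoning and the efficiency figure quoted in the abstract can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: the function name `attainable_gops` and the memory bandwidth value are hypothetical and chosen only for illustration, while the 400 GOp/s and 25% figures come from the abstract. The peak throughput is derived from those two figures rather than quoted.

```python
def attainable_gops(peak_gops, bandwidth_gbs, intensity_ops_per_byte):
    """Roofline bound: the smaller of the compute and memory ceilings."""
    return min(peak_gops, bandwidth_gbs * intensity_ops_per_byte)


# Figures quoted in the abstract.
achieved_gops = 400.0  # particle mover throughput
efficiency = 0.25      # fraction of peak theoretical performance

# Peak implied by those two figures (derived here, not quoted).
implied_peak_gops = achieved_gops / efficiency

# ASSUMPTION: illustrative memory bandwidth in GB/s, not from the abstract.
assumed_bandwidth_gbs = 190.0

# Arithmetic intensity at which the two ceilings meet (the ridge point).
ridge_intensity = implied_peak_gops / assumed_bandwidth_gbs

for intensity in (1.0, ridge_intensity, 32.0):
    bound = attainable_gops(implied_peak_gops, assumed_bandwidth_gbs, intensity)
    print(f"{intensity:6.2f} Op/byte -> at most {bound:7.1f} GOp/s")
```

Under these assumptions, a kernel below the ridge point is limited by memory traffic and one above it by arithmetic throughput, which is why a roofline analysis is a natural guide for deciding which optimizations to pursue in a particle mover.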
Barnes Daniel C.
Chacón Luis
Chen Guangye
An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm