Astronomy and Astrophysics – Astrophysics
Scientific paper
2005-11-02
New Astron.12:169-181,2006
32 pages, 2 figures
10.1016/j.newast.2006.07.007
The main performance bottleneck of gravitational N-body codes is the force calculation between two particles. We have succeeded in speeding up this pair-wise force calculation by factors between two and ten, depending on the code and the processor on which the code is run. These speedups were obtained by writing highly fine-tuned code for x86_64 microprocessors. Any existing N-body code, running on these chips, can easily incorporate our assembly code programs. In the current paper, we present an outline of our overall approach, which we illustrate with one specific example: the use of a Hermite scheme for a direct N^2 type integration on a single 2.0 GHz Athlon 64 processor, for which we obtain an effective performance of 4.05 Gflops, for double precision accuracy. In subsequent papers, we will discuss other variations, including the combinations of N log N codes, single precision implementations, and performance on other microprocessors.
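To make the pair-wise calculation concrete, the following is a minimal reference sketch of what a Hermite scheme needs from each particle pair: the acceleration and its time derivative (the jerk). It is plain Python written for readability, not the authors' tuned x86_64 assembly. The function names `acc_jerk_pair` and `acc_jerk_all`, the softening parameter `eps2`, and the choice of units with G = 1 are assumptions made for this illustration.

```python
import math


def acc_jerk_pair(xi, vi, xj, vj, mj, eps2=0.0):
    """Acceleration and jerk on particle i due to particle j (units with G = 1).

    xi, vi, xj, vj are 3-component sequences; eps2 is an optional
    squared softening length (hypothetical parameter for this sketch).
    """
    # Relative position and velocity, pointing from i to j.
    dx = [xj[k] - xi[k] for k in range(3)]
    dv = [vj[k] - vi[k] for k in range(3)]

    r2 = dx[0] * dx[0] + dx[1] * dx[1] + dx[2] * dx[2] + eps2
    rv = dx[0] * dv[0] + dx[1] * dv[1] + dx[2] * dv[2]

    # One inverse square root per pair; the rest is multiplies and adds.
    rinv = 1.0 / math.sqrt(r2)
    rinv2 = rinv * rinv
    mrinv3 = mj * rinv * rinv2
    alpha = 3.0 * rv * rinv2

    # a = m_j * dx / r^3
    acc = [mrinv3 * dx[k] for k in range(3)]
    # j = m_j * (dv / r^3 - 3 (dx . dv) dx / r^5)
    jerk = [mrinv3 * (dv[k] - alpha * dx[k]) for k in range(3)]
    return acc, jerk


def acc_jerk_all(pos, vel, mass, eps2=0.0):
    """Direct N^2 summation of acceleration and jerk for every particle."""
    n = len(mass)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    jerk = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip the self-interaction
            a, jk = acc_jerk_pair(pos[i], vel[i], pos[j], vel[j], mass[j], eps2)
            for k in range(3):
                acc[i][k] += a[k]
                jerk[i][k] += jk[k]
    return acc, jerk
```

The inner function is the part that runs N(N-1) times per full force evaluation, which is why the abstract identifies it as the bottleneck: any saving in this handful of arithmetic operations is multiplied by the number of pairs.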
Hut Piet
Makino Junichiro
Nitadori Keigo
Performance Tuning of N-Body Codes on Modern Microprocessors: I. Direct Integration with a Hermite Scheme on x86_64 Architecture