Real-Time N-BODY simulation accelerated by GPU

23,783

Uploaded on Jun 9, 2007

website: http://progrape.jp/cs/
GPU: GeForce8800GTX
CPU: Intel Core2Duo 6600
M/B: ASUS P5L-MX (micro-ATX type)
RAM: 1GB
PowerUnit: 550 Watt max.

Category:

Entertainment

License:

Standard YouTube License

Uploader Comments (cunbody1)

  • So 334.8 Gflops is the peak performance?

  • The theoretical peak performance of the GeForce8800GTX (G80) is 518.4 Gflops.

    It comes from 1.35 GHz (SP clock) x 128 (number of SPs) x 3 (1 add, 2 mult).

    It is open to debate whether the 2 mult operations can actually run simultaneously. Some NVIDIA people said they can, and others said they can't.

    If the 2 mult operations can't run simultaneously, the theoretical peak performance of the GeForce8800GTX (G80) becomes 345.6 Gflops.
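
The arithmetic in this reply can be checked with a short Python sketch. The function name `peak_gflops` is made up for illustration; the clock, SP count, and operations per clock are the figures quoted in the comment.

```python
def peak_gflops(sp_clock_ghz, num_sps, flops_per_clock):
    """Theoretical peak in Gflops: SP clock (GHz) x number of SPs x flops per SP per clock."""
    return sp_clock_ghz * num_sps * flops_per_clock

# Figures quoted in the comment for the GeForce8800GTX (G80).
dual_issue = peak_gflops(1.35, 128, 3)  # 1 add + 2 mult per clock
mad_only = peak_gflops(1.35, 128, 2)    # 1 add + 1 mult per clock

print(f"with both mults:  {dual_issue:.1f} Gflops")
print(f"with one mult:    {mad_only:.1f} Gflops")
```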

  • Nice. I recently wrote a modified direct summation algorithm on a Radeon 4850 and got ~ 35 billion interactions per second, or 0.7 Teraflops - enough for an 80k particle system in real time.
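
As a rough check of these figures, the sketch below derives the flop count per interaction implied by the two quoted rates and the number of full force evaluations per second for 80k particles. It assumes every one of the N x N pairs is evaluated each step, which the comment does not state; the variable names are illustrative.

```python
# Figures quoted in the comment.
interactions_per_sec = 35e9
sustained_flops = 0.7e12
n_particles = 80_000

# Flops per pairwise interaction implied by the two quoted rates.
flops_per_interaction = sustained_flops / interactions_per_sec

# Assumption: one step evaluates all N x N pairs (plain direct summation).
interactions_per_step = n_particles ** 2
steps_per_sec = interactions_per_sec / interactions_per_step

print(f"implied flops per interaction: {flops_per_interaction:.0f}")
print(f"full force evaluations per second: {steps_per_sec:.2f}")
```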

  • How did you get that? ATI claims that the HD 4850 has 1 TFlops...

  • This demonstration was the earliest attempt in the world at an N-body calculation on unified shader hardware.

    Today most GPU hardware, including ATI cards, has adopted the unified shader architecture.

    We also know that the ATI card has much more floating-point hardware, and its potential exceeds that of the GeForce.

    But the basic approach, which we call the chamomile scheme, stays the same: use the small on-chip memory to store as many J-particles as possible.
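
The idea of staging J-particles in a small fast buffer can be sketched in plain Python. This is not the authors' kernel and makes no claim about how the chamomile scheme is actually implemented; it only illustrates the general tiling pattern. The function name `accel_tiled`, the tile size, the softening `eps`, and G = 1 are all assumptions made for the example.

```python
import math

def accel_tiled(pos, mass, tile_size=4, eps=1e-3):
    """Direct-summation accelerations (G = 1), processing j-particles in tiles.

    Each tile stands in for a small on-chip buffer: it is loaded once and
    reused by every i-particle before the next tile is fetched.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for start in range(0, n, tile_size):
        # "Load" one tile of j-particles into the small buffer.
        stop = min(start + tile_size, n)
        tile = [(pos[j], mass[j]) for j in range(start, stop)]
        for i in range(n):
            xi, yi, zi = pos[i]
            ax = ay = az = 0.0
            for (xj, yj, zj), mj in tile:
                dx, dy, dz = xj - xi, yj - yi, zj - zi
                # Softening keeps the i == j term finite; it contributes zero force.
                r2 = dx * dx + dy * dy + dz * dz + eps * eps
                w = mj / (r2 * math.sqrt(r2))
                ax += w * dx
                ay += w * dy
                az += w * dz
            acc[i][0] += ax
            acc[i][1] += ay
            acc[i][2] += az
    return acc

# Usage: three particles, tiles of two j-particles each.
example_pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
example_mass = [1.0, 2.0, 3.0]
for a in accel_tiled(example_pos, example_mass, tile_size=2):
    print(a)
```

On a GPU the tile would live in on-chip memory shared by a group of threads, so each j-particle fetched from main memory is reused by many i-particles.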

  • For the direct summation algorithm, there are two problems with the implementation used in the demo.

    First, single precision is enough for most of the kernel; only the subtraction of coordinates and the accumulation of pairwise forces need double-precision accuracy.

    Second, if we need to simulate dense stellar objects such as a globular cluster or the center of a galaxy, we need a clever integration scheme such as high-order Hermite integration with block time-steps.
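
The first problem, losing accuracy when nearby coordinates are subtracted in single precision, can be illustrated with a small sketch. It emulates single precision by round-tripping values through the standard `struct` module; the helper `to_f32` and the sample coordinates are made up for the example.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Two nearby coordinates far from the origin; the true separation is 1e-4.
xi, xj = 1000.0, 1000.0001

dx_double = xj - xi
dx_single = to_f32(to_f32(xj) - to_f32(xi))

rel_err_double = abs(dx_double - 1e-4) / 1e-4
rel_err_single = abs(dx_single - 1e-4) / 1e-4

print(f"double: dx = {dx_double:.10e}, relative error = {rel_err_double:.2e}")
print(f"single: dx = {dx_single:.10e}, relative error = {rel_err_single:.2e}")
```

The error comes from rounding the coordinates themselves, which is why the subtraction step is singled out while the rest of the force calculation can stay in single precision.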

  • With block time-steps, GPU performance depends strongly on the case where a very small number of i-particles (ni) interacts with a large number of j-particles (nj). The kernel used in the demo performs poorly in that case, so I do not recommend using your kernel, which performs well only for the full NxN interaction, for dense stellar N-body simulation.

Top Comments

  • n-body is the calculation of movements of more than 2 gravitationally influencing bodies in space. It is also heavily tied to chaos theory as any insignificant deviation of a starting position and/or velocity of even one body results very quickly in very different outcomes for the entire group.

    A star cluster or galaxy is a good example.

  • Nice

All Comments (24)

  • Wonderful

    

  • If you multiply the Newtonian gravitational force by a cosine with a wavelength of around 50 thousand light years I think you will see the mechanism for generating forms like ring galaxies, and spiral galaxies, evolving from a simple spherical Gaussian initial distribution. As the core condenses a spherical gravitational potential hill develops where the cosine is negative. The hill should accrete a shell of masses that should eventually collapse to a ring due to intra-ring mass interactions.

  • If you modify the gravity law from K(G)/d^2, i.e. Newtonian, wherein the force is inversely proportional to the square of the distance, by multiplying it by a wave factor, cos(kd), with a wavelength 2pi/k on the scale of e.g. 1/3 the screen width for the viewpoint used at 1:56 in this video, I think you'll get a more realistic galaxy-type shape evolution. Consider the change as a low-energy quantum gravity expression. I suppose it cuts down the sim rate a bit, but I think the effect is worth it.

  • Hi, I'm programming an N-body simulation for school. Do you know where I can get a sample algorithm for a 2D simulation?

    Thanks
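
A minimal 2D sketch of the kind asked about here, using direct summation with a leapfrog (kick-drift-kick) integrator in plain Python. Everything in it is an assumption chosen for illustration: G = 1, the softening `eps`, the time step, the function names, and the two-body test setup. It is unrelated to the GPU code in the video.

```python
import math

def accelerations(pos, mass, g=1.0, eps=0.01):
    """Pairwise gravitational accelerations in 2D by direct summation."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r2 = dx * dx + dy * dy + eps * eps  # softened squared distance
            w = g * mass[j] / (r2 * math.sqrt(r2))
            acc[i][0] += w * dx
            acc[i][1] += w * dy
    return acc

def leapfrog_step(pos, vel, mass, dt):
    """Advance positions and velocities in place by one kick-drift-kick step."""
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        vel[i][0] += 0.5 * dt * acc[i][0]  # half kick
        vel[i][1] += 0.5 * dt * acc[i][1]
        pos[i][0] += dt * vel[i][0]        # drift
        pos[i][1] += dt * vel[i][1]
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        vel[i][0] += 0.5 * dt * acc[i][0]  # second half kick
        vel[i][1] += 0.5 * dt * acc[i][1]

# Usage: two equal masses on a roughly circular orbit around their center of mass.
mass = [1.0, 1.0]
pos = [[-0.5, 0.0], [0.5, 0.0]]
v = math.sqrt(0.5)  # circular-orbit speed for G = 1, separation 1, ignoring softening
vel = [[0.0, -v], [0.0, v]]

for _ in range(100):
    leapfrog_step(pos, vel, mass, dt=0.01)

separation = math.hypot(pos[1][0] - pos[0][0], pos[1][1] - pos[0][1])
print(f"separation after 100 steps: {separation:.4f}")
```

The double loop costs O(N^2) per step, which is the same direct-summation cost that the GPU demo accelerates.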

  • DirectX and some read-back commands. This creates some overhead; it could go about 10% faster without it... so no treecode or anything fancier.

    "1 TFlop" counts MAD operations only. For N-body you also need some plain multiplications and one square root, so ~700 Gflops is still surprisingly good.

    I hope we will see OpenCL or DirectX 11 real soon, since currently ATI cards are MUCH faster for scientific calculations than Nvidia-cards and I don't like CUDA much.
