Added: 1 year ago
From: jahkr
Views: 6,274

All Comments (4)

  • Thanks for your reply. I have never tried GPU programming, but there are some algorithms you might be able to use (such as a linked list) to speed up the search. By the way, which type of memory does your code use for the neighboring-particle search: global memory or shared memory? Is there any reference material you would recommend for particle programming on the GPU?

  • OK, in my CPU code I'm using a tree search algorithm, which is similar to a linked list. In the GPU code I'm using shared memory because it gives very fast access to all the data at the same time; the problem is that shared memory is only 16 KB (GTS 250).

    As for reference material, I began with the OpenCL and CUDA online references and then tried to understand and modify the example code in the CUDA SDK.

  • Excellent simulation. What speedup do you get with CUDA, and which data type did you use: single-precision float or double precision?

  • Thanks for the comment.

    I have used float4 data types to improve performance. The application runs at 240 Gflops with 25k particles, but the main problem is that the number of operations increases with n^2 (where n is the number of particles). In the CPU version I'm using a neighbour search optimization, so the number of operations increases with n*log10(n); this means that for large particle numbers the CPU is faster than the GPU. I'm trying to adapt the neighbour search to the GPU, but it isn't easy.
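
For readers following the n^2 versus neighbour-search discussion above, here is a minimal sketch of the linked-list (cell list) idea mentioned in the first comment. It is not the author's tree search and not GPU code: it is plain Python for brevity, and the function names (`build_cell_list`, `neighbor_pairs`, `brute_force_pairs`) and the `cutoff` parameter are assumptions made up for this illustration. Particles are binned into cubic cells whose edge equals the interaction radius, so each particle only needs to be compared against the 27 surrounding cells instead of against all n particles.

```python
from collections import defaultdict
from itertools import product


def dist_sq(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def build_cell_list(positions, cutoff):
    """Bin particle indices into cubic cells of edge length `cutoff`."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(positions):
        key = (int(x // cutoff), int(y // cutoff), int(z // cutoff))
        cells[key].append(i)
    return cells


def neighbor_pairs(positions, cutoff):
    """Return all index pairs (i, j), i < j, closer than `cutoff`.

    Only the particle's own cell and the 26 adjacent cells are scanned.
    """
    cells = build_cell_list(positions, cutoff)
    cutoff_sq = cutoff * cutoff
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() avoids creating empty cells while iterating.
            other = cells.get((cx + dx, cy + dy, cz + dz))
            if not other:
                continue
            for i in members:
                for j in other:
                    if i < j and dist_sq(positions[i], positions[j]) < cutoff_sq:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(positions, cutoff):
    """Reference all-pairs search: the n^2 approach, for comparison."""
    cutoff_sq = cutoff * cutoff
    n = len(positions)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist_sq(positions[i], positions[j]) < cutoff_sq
    }
```

Both functions return the same set of pairs; the difference is that the cell list does work roughly proportional to n when the particle density is roughly uniform, while the brute-force version always checks n*(n-1)/2 pairs. On a GPU the same binning is usually done with a sort by cell index, which is where most of the porting effort tends to go.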
