This is a video from a case study/tech demo our friends from Instinct Tech (www.instinct-tech.com) did for us. We wanted to know how easily a technology like CUDA could be used from inside Instinct Studio (the game engine we're using for our WIP title DogFighter [www.dark-water-studios.com]) and whether the additional horsepower could be used for game mechanics.
The demo shows 4096 bot planes handled solely on a single GPU, in parallel with game rendering. The bot planes use steering behaviours for flocking, navigation and obstacle avoidance. The planes are fully lit and rendered (with shadows). The demo runs at interactive frame rates on mainstream CUDA-enabled graphics cards. In comparison, the same simulation without CUDA achieved a similar frame rate on a decent machine with only 512 planes in our tests. The steering computation for 512 planes requires about 260,000 neighbour queries, while for 4096 planes this grows to a whopping 16 million queries. The algorithm is easy to parallelize, which explains the advantage of technologies like CUDA for this kind of problem. Even if there is potential to optimize the algorithm for the CPU, the clear benefit for us is a heavily reduced development time.
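The query counts quoted above match the square of the plane count, which suggests an all-pairs test (the comments below also call it brute force). The following is a minimal Python sketch of that idea, not the demo's actual CUDA code; the function names `count_neighbour_queries` and `neighbours_brute_force` and the `radius` parameter are assumptions made for illustration.

```python
def count_neighbour_queries(num_planes):
    # All-pairs test: every plane checks every plane, so n * n queries.
    return num_planes * num_planes


def neighbours_brute_force(positions, radius):
    """Return, for each plane, the indices of the other planes within radius."""
    radius_sq = radius * radius
    result = []
    # Each iteration of the outer loop reads shared data but writes only its
    # own result, so on a GPU each plane could be handled by its own thread.
    for i, (xi, yi, zi) in enumerate(positions):
        near = []
        for j, (xj, yj, zj) in enumerate(positions):
            if i == j:
                continue
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            if dist_sq <= radius_sq:
                near.append(j)
        result.append(near)
    return result


print(count_neighbour_queries(512))   # 262144
print(count_neighbour_queries(4096))  # 16777216
```

Going from 512 to 4096 planes is 8 times as many planes but 64 times as many queries, which is why the independence of the outer loop matters so much here.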
This demonstrates collision avoidance only - is it technically possible to use CUDA for calculating actual collisions and let planes crash into the scenery and into each other and fall to the ground? How would this affect the performance?
CaptainClass 2 years ago
It is generally possible, but conceptually a different thing from the demo. NVIDIA is accelerating PhysX with CUDA, which is basically what you are asking for. Performance would generally be much lower, as the problem can't be parallelized as well.
DarkWaterDev 2 years ago
Do you really test the position of every plane against the position of every other plane? An octree-like data structure would be so much more efficient. Is it really right to go with a less efficient algorithm just to be able to do more things in parallel (or just because CUDA evaluates 'if' branches sequentially)?
linuxx0r 2 years ago
The showcase demonstrates that you can run computation-heavy workloads on the GPU without sacrificing too much rendering quality. You can easily optimize the brute-force algorithm used and spend the freed cycles on other tasks, but that's a different story.
DarkWaterDev 2 years ago
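The trade-off raised in this exchange (all-pairs testing versus a spatial data structure) can be illustrated with a small Python sketch. It uses a uniform grid rather than the octree mentioned in the question, simply because it is shorter to write; the names `build_grid` and `neighbours_grid` are hypothetical and nothing here is taken from the demo itself.

```python
from collections import defaultdict


def cell_of(position, cell_size):
    # Map a 3D position to the integer coordinates of its grid cell.
    x, y, z = position
    return (int(x // cell_size), int(y // cell_size), int(z // cell_size))


def build_grid(positions, cell_size):
    # Bucket plane indices by grid cell.
    grid = defaultdict(list)
    for index, position in enumerate(positions):
        grid[cell_of(position, cell_size)].append(index)
    return grid


def neighbours_grid(positions, radius):
    """Return, for each plane, the sorted indices of other planes within radius."""
    # With cell size equal to the radius, every neighbour of a plane lies in
    # the plane's own cell or one of the 26 adjacent cells.
    grid = build_grid(positions, radius)
    radius_sq = radius * radius
    result = []
    for i, (x, y, z) in enumerate(positions):
        cx, cy, cz = cell_of((x, y, z), radius)
        near = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                        if j == i:
                            continue
                        xj, yj, zj = positions[j]
                        dist_sq = (x - xj) ** 2 + (y - yj) ** 2 + (z - zj) ** 2
                        if dist_sq <= radius_sq:
                            near.append(j)
        result.append(sorted(near))
    return result
```

When planes are spread out, each plane only inspects a handful of candidates instead of all the others. The cost is an extra build step every frame and uneven work per plane, which is the kind of irregularity the question alludes to when it mentions branching on the GPU.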