NVIDIA on Monday moved its general-purpose GPU computing platform ahead with CUDA 4.0. The update focuses mostly on parallelism between graphics cards and multi-core processors. GPUDirect 2.0 now lets graphics chips in the same system talk directly to each other rather than having to pass data through the processor. Multiple CPU threads can also now share a single graphics card, and a single CPU thread can speak to multiple graphics cards to offload several heavy tasks at once.
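For readers who want to see what direct GPU-to-GPU communication looks like in code, here is a minimal sketch using the peer-to-peer calls in the CUDA runtime API. It assumes a machine with at least two peer-capable NVIDIA GPUs; the device indices 0 and 1, the 1 MB buffer size, and the `CHECK` helper macro are illustrative choices, not anything NVIDIA prescribes.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA call fails.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err = (call);                                 \
        if (err != cudaSuccess) {                                 \
            fprintf(stderr, "CUDA error: %s\n",                   \
                    cudaGetErrorString(err));                     \
            exit(1);                                              \
        }                                                         \
    } while (0)

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Ask the runtime whether GPU 0 and GPU 1 can reach each other directly.
    int canAccess01 = 0, canAccess10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&canAccess10, 1, 0));
    if (!canAccess01 || !canAccess10) {
        printf("GPUs 0 and 1 do not support peer access.\n");
        return 0;
    }

    const size_t bytes = 1 << 20;  // 1 MB, an arbitrary example size
    void *buf0 = NULL, *buf1 = NULL;

    // One CPU thread drives both GPUs by switching the current device.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&buf0, bytes));
    CHECK(cudaMemset(buf0, 0, bytes));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));

    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&buf1, bytes));
    CHECK(cudaDeviceEnablePeerAccess(0, 0));

    // Copy from GPU 0's memory to GPU 1's memory without a host buffer.
    CHECK(cudaMemcpyPeer(buf1, 1, buf0, 0, bytes));
    CHECK(cudaDeviceSynchronize());
    printf("Copied %zu bytes from GPU 0 to GPU 1.\n", bytes);

    CHECK(cudaFree(buf1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(buf0));
    return 0;
}
```

Saved under a hypothetical name such as `p2p.cu`, this should build with `nvcc p2p.cu -o p2p`. On a system with only one GPU, or with GPUs that cannot reach each other, it prints a message and exits rather than attempting the copy.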
CUDA also now has a unified virtual address space that both the main processor and graphics chips can use. Beyond sharing, it now has “added support” for Mac OS X, likely tied to cuda-gdb code. New code libraries can accelerate sorting by as much as 100 times and speed up the creation of apps that depend heavily on image processing.
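To give a sense of what library-accelerated sorting looks like, here is a short sketch using Thrust, a C++ template library that ships with the CUDA toolkit. NVIDIA's announcement as reported here does not name the sorting library, so treating it as Thrust is an assumption; the array size and random input are also just illustrative.

```cuda
#include <cstdio>
#include <cstdlib>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>

int main() {
    // Fill a host (CPU-side) array with about a million random integers.
    thrust::host_vector<int> host(1 << 20);
    for (size_t i = 0; i < host.size(); ++i) {
        host[i] = rand();
    }

    // Assigning to a device_vector copies the data to GPU memory.
    thrust::device_vector<int> device = host;

    // The sort runs on the GPU because the iterators point at device memory.
    thrust::sort(device.begin(), device.end());

    // Copy the sorted data back and verify the order on the CPU.
    thrust::copy(device.begin(), device.end(), host.begin());
    bool sorted = thrust::is_sorted(host.begin(), host.end());
    printf("Sorted on GPU: %s\n", sorted ? "yes" : "no");
    return sorted ? 0 : 1;
}
```

How much faster this is than a CPU sort depends on the hardware and the data; the sketch only shows the programming model, not the "100 times" figure.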
For those of you whose heads are spinning reading this: in short, this tech lets NVIDIA's graphics processors, AKA GPUs, do tasks that in the past only the machine's CPU would do. By offloading that work, you get faster performance with the same hardware. In some cases, the GPU can even do those tasks better than the CPU. Adobe, for example, uses CUDA in Premiere Pro to deliver a fluid, real-time video editing experience.
A near-final release candidate of the CUDA 4.0 toolkit is due to be posted for free on March 4. NVIDIA hasn’t said when it expects the code to go final. CUDA is NVIDIA’s proprietary general-purpose computing platform and only accelerates calculations on NVIDIA’s own graphics hardware, such as GeForce cards.