NVIDIA has released CUDA 5, which can be downloaded for free from the company's Developer Zone website. The update promises to make NVIDIA's parallel computing platform easier to use than ever. It comes with new features such as Dynamic Parallelism, which brings GPU acceleration to new algorithms; GPU-callable libraries; GPUDirect Support for RDMA, which minimizes memory bottlenecks; and NVIDIA Nsight Eclipse Edition, for generating and debugging CUDA code.
NVIDIA today made available the NVIDIA® CUDA® 5 production release, a powerful new version of the world's most pervasive parallel computing platform and programming model for accelerating scientific and engineering applications on GPUs. It can be downloaded for free from the NVIDIA Developer Zone website.
With more than 1.5 million downloads, supporting more than 180 leading engineering, scientific and commercial applications, the CUDA programming model is the most popular way for developers to take advantage of GPU-accelerated computing.
Building on this success, the new programming features of the CUDA 5 platform make the development of GPU-accelerated applications faster and easier than ever, including support for dynamic parallelism, GPU-callable libraries, NVIDIA GPUDirect™ technology support for RDMA (remote direct memory access) and the NVIDIA Nsight™ Eclipse Edition integrated development environment (IDE).
Developer Accolades for CUDA 5
Developers who evaluated the pre-release version of CUDA 5 have reported often dramatic application acceleration and improved programmability.
The defense and aerospace industries realize the benefits of CUDA GPU acceleration for processing images, video and sensor data, such as radar. According to Dustin Franklin, GPGPU applications engineer at GE Intelligent Platforms in Charlottesville, Va., "CUDA 5 is a significant technology for us. Many of the applications we're using involve streaming sensor data directly into the GPU with low latency, so the GPUDirect support for RDMA on new Kepler GPUs is incredibly important for our customers. We have integrated support for many custom sensors already and are very happy with the results."
Guillaume Belz, a research biochemist at Lyon University Hospital in Lyon, France, has been using dynamic parallelism and GPU-callable libraries for complex signal analysis and data mining. "With GPU acceleration, we can get results in several hours for projects that used to require weeks or even months with CPUs alone. Without GPU acceleration, analysis is not possible at all," said Belz.
Weihua (Wayne) Sun, Ph.D. candidate in imaging science at Rochester Institute of Technology in New York, was impressed with NVIDIA Nsight Eclipse Edition. "When I learned that CUDA 5 included the new Nsight Eclipse Edition IDE, I knew I needed it right away. Having all my programming, debugging and optimization tools in a single integrated development environment is a great productivity boost for me."
New CUDA 5 Features
CUDA 5 enables developers to take full advantage of the performance of NVIDIA GPUs, including GPU accelerators based on the NVIDIA Kepler™ compute architecture -- the fastest, most efficient, highest-performance computing architecture ever built. Key features include:
Dynamic Parallelism - Brings GPU acceleration to new algorithms
GPU threads can dynamically spawn new threads, allowing the GPU to adapt to the data. By minimizing the back and forth with the CPU, dynamic parallelism greatly simplifies parallel programming. And it enables GPU acceleration of a broader set of popular algorithms, such as those used in adaptive mesh refinement and computational fluid dynamics applications.
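As a rough illustration of the idea, the sketch below has a parent kernel launch a child kernel for each segment of an array, sizing every child launch from the data itself. The kernel names, the segment layout and the build line are assumptions made for this example and do not come from NVIDIA's announcement. Dynamic parallelism needs a GPU of compute capability 3.5 or newer and relocatable device code.

```cuda
// Assumed build line: nvcc -arch=sm_35 -rdc=true dynpar.cu -lcudadevrt
// (use any -arch value of sm_35 or newer that your toolkit supports)
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel (hypothetical): one thread per element of a single segment.
__global__ void fill_segment(int *out, int offset, int count, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) out[offset + i] = value;
}

// Parent kernel (hypothetical): one thread per segment. Each thread reads
// its segment length on the GPU and launches a child grid of matching size,
// with no round trip to the CPU.
__global__ void process_segments(int *out, const int *offsets,
                                 const int *counts, int num_segments) {
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s < num_segments && counts[s] > 0) {
        int threads = 32;
        int blocks = (counts[s] + threads - 1) / threads;
        fill_segment<<<blocks, threads>>>(out, offsets[s], counts[s], s);
    }
}

int main() {
    const int num_segments = 4;
    const int total = 10;
    const int h_counts[num_segments]  = {1, 3, 2, 4};  // made-up segment sizes
    const int h_offsets[num_segments] = {0, 1, 4, 6};  // start of each segment

    int *d_out = NULL, *d_offsets = NULL, *d_counts = NULL;
    cudaMalloc(&d_out, total * sizeof(int));
    cudaMalloc(&d_offsets, num_segments * sizeof(int));
    cudaMalloc(&d_counts, num_segments * sizeof(int));
    cudaMemset(d_out, 0, total * sizeof(int));
    cudaMemcpy(d_offsets, h_offsets, num_segments * sizeof(int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_counts, h_counts, num_segments * sizeof(int),
               cudaMemcpyHostToDevice);

    process_segments<<<1, num_segments>>>(d_out, d_offsets, d_counts,
                                          num_segments);

    // The host-side synchronize returns only after the parent grid and all
    // child grids it launched have finished.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int h_out[total];
    cudaMemcpy(h_out, d_out, total * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < total; ++i) printf("%d ", h_out[i]);
    printf("\n");

    cudaFree(d_out);
    cudaFree(d_offsets);
    cudaFree(d_counts);
    return 0;
}
```

Each output element ends up holding the index of the segment that wrote it. In a real adaptive code the parent would typically decide whether to refine a region at all, not just how many threads to use.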
GPU-Callable Libraries - Enables third-party ecosystem
A new CUDA BLAS library allows developers to use dynamic parallelism for their own GPU-callable libraries. They can design plug-in APIs that let other developers extend the functionality of their kernels, and they can implement callbacks on the GPU to customize the functionality of third-party GPU-callable libraries. The "object linking" capability provides an efficient and familiar process for developing large GPU applications by enabling developers to compile multiple CUDA source files into separate object files, and link them into larger applications and libraries.
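A GPU-side callback of the kind described above might look roughly like the following sketch. The "library" routine, the callback type and all names are hypothetical and are kept in one file for brevity; they are not part of any NVIDIA library. With object linking, the library routine and the user callback could instead live in separate .cu files compiled with nvcc's -rdc=true option and linked together.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical callback signature that a GPU-callable library might publish.
typedef float (*unary_op_t)(float);

// "Library" side (hypothetical): a device routine that applies whatever
// callback the caller supplies to each element.
__device__ void lib_transform(float *data, int n, unary_op_t op) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = op(data[i]);
}

// "User" side: the callback that customizes the library routine.
__device__ float square(float x) { return x * x; }

// A device function's address must be taken in device code, so it is stored
// in a __device__ variable and copied to the host before the launch.
__device__ unary_op_t d_square_ptr = square;

__global__ void apply(float *data, int n, unary_op_t op) {
    lib_transform(data, n, op);
}

int main() {
    const int n = 8;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *d = NULL;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    // Fetch the device-side function pointer so it can be passed as a
    // kernel argument.
    unary_op_t h_op = NULL;
    cudaMemcpyFromSymbol(&h_op, d_square_ptr, sizeof(unary_op_t));

    apply<<<1, n>>>(d, n, h_op);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("%g ", h[i]);
    printf("\n");

    cudaFree(d);
    return 0;
}
```

Passing the operation as a function pointer keeps the library routine unaware of the user's code at compile time, which is the property that separate compilation and linking are meant to support.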
GPUDirect Support for RDMA - Minimizes system memory bottlenecks
GPUDirect technology enables direct communication between GPUs and other PCI-E devices, and supports direct memory access between network interface cards and the GPU. It also significantly reduces MPI send/receive (MPI_Sendrecv) latency between GPU nodes in a cluster and improves overall application performance.
NVIDIA Nsight Eclipse Edition - Generate CUDA code quickly and easily
NVIDIA Nsight Eclipse Edition enables programmers to develop, debug and profile GPU applications within the familiar Eclipse-based IDE on Linux and Mac OS X platforms. An integrated CUDA editor and CUDA samples speed the generation of CUDA code, and automatic code refactoring enables easy porting of CPU loops to CUDA kernels. An integrated expert analysis system provides automated performance analysis and step-by-step guidance to fix performance bottlenecks in the code, while syntax highlighting makes it easy to differentiate GPU code from CPU code.
New Online CUDA Resource Center
To help developers maximize the potential of parallel computing with CUDA technology, NVIDIA has launched a free online resource center for CUDA programmers at http://docs.nvidia.com. The site offers the latest information on the CUDA platform and programming model, as well as access to all CUDA developer documentation and technologies, including tools, code samples, libraries, APIs, and tuning and programming guides.