Nvidia To Launch CUDA 4.0 For Software Developers This Week

Nvidia on Monday offered details regarding its upcoming CUDA 4.0 software development kit (SDK), which the company will launch Friday.

Nvidia says the latest version of its parallel-processing SDK will allow developers to share a GPU's computational power across multiple threads, while also letting a single thread access all of the Nvidia GPUs in a system. In addition, CUDA 4.0 comes with new C/C++ features, adds unified virtual addressing, and supports the creation of GPU-based computational applications such as sophisticated image recognition on mobile handheld PCs.
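As a rough illustration of what single-thread access to several GPUs might look like in practice, the sketch below has one host thread split an array across every visible GPU and scale each slice. The helper name `scale_on_all_gpus` and the kernel `scale` are hypothetical names invented for this example; the runtime calls (`cudaGetDeviceCount`, `cudaSetDevice`, `cudaMalloc`, `cudaMemcpy` with `cudaMemcpyDefault`, `cudaFree`) are standard CUDA runtime functions. It assumes a system that supports unified virtual addressing, since `cudaMemcpyDefault` relies on the runtime inferring the copy direction from the pointer values.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Multiply each element by `factor`; one GPU thread per element.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: a single host thread gives each visible GPU one
// slice of `host` to scale. Returns the number of GPUs used, 0 if none
// are available, or -1 on a CUDA error.
int scale_on_all_gpus(std::vector<float> &host, float factor) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 1) return 0;

    const int n = static_cast<int>(host.size());
    const int chunk = (n + count - 1) / count;  // slice size, rounded up
    int used = 0;

    for (int dev = 0; dev < count; ++dev) {
        const int begin = dev * chunk;
        if (begin >= n) break;  // fewer elements than GPUs
        const int len = (n - begin < chunk) ? (n - begin) : chunk;
        const size_t bytes = len * sizeof(float);

        // The same host thread simply switches the current device.
        cudaSetDevice(dev);
        float *d = nullptr;
        if (cudaMalloc((void **)&d, bytes) != cudaSuccess) return -1;

        // With unified virtual addressing, cudaMemcpyDefault lets the
        // runtime work out the copy direction from the pointers.
        cudaMemcpy(d, host.data() + begin, bytes, cudaMemcpyDefault);
        scale<<<(len + 255) / 256, 256>>>(d, len, factor);
        cudaError_t err =
            cudaMemcpy(host.data() + begin, d, bytes, cudaMemcpyDefault);
        cudaFree(d);
        if (err != cudaSuccess) return -1;
        ++used;
    }
    return used;
}

int main() {
    std::vector<float> values(8, 1.0f);
    int used = scale_on_all_gpus(values, 2.0f);
    std::printf("GPUs used: %d, first value: %.1f\n", used, values[0]);
    return 0;
}
```

For simplicity the slices are processed one GPU after another; a real application would more likely use streams and asynchronous copies so that the GPUs work concurrently.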

In an interview with CRN, Sanford Russell, director of CUDA marketing at Nvidia, said that CUDA, the parallel-processing architecture that forms the computational side of Nvidia's GPU technology, has improved consistently over its first four generations. The current GPU generation is code-named Fermi.

"Most recently, Nvidia launched its first CUDA architecture derivatives last year at our GPU Technology Conference in San Jose," Russell said. "Our goal now is to allow the developer to take the same code and run it on anything -- on a range of devices as well as operating systems."

Russell said developers can now choose between Intel's x86 technology and Nvidia's upcoming CPU offering, the ARM-based Project Denver processors. In addition, he said CUDA 4.0 will allow developers to run code on Nvidia GPUs under Android, Linux, or Windows.

"We're providing architecture that will work across them all," Russell said. "From a program standpoint it should be the same code. That's our vision. There's no one else that would say they'll run the same platform, whether it's on ARM or x86 technology, on any device from a supercomputer down to a smartphone."

Russell said Nvidia's latest version of CUDA allows users to more easily port their parallel applications using automated performance analysis, C++ debugging, and a GPU binary disassembler on either Windows or Mac OS. In addition, he said CUDA 4.0 includes developer tools such as Nvidia GPUDirect version 2.0 and GPU-accelerated MPI.

Russell said CUDA 4.0 enables faster multi-GPU programming and application porting through unified virtual addressing and Thrust, Nvidia's library of high-performance, open source C++ parallel algorithms and data structures. Russell said Thrust automatically chooses the fastest code path at compile time and lets developers divide up compute-intensive workloads between their GPUs and CPUs.
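To give a sense of what Thrust's C++ algorithms and data structures look like, here is a minimal sketch that copies a small array to the GPU, sorts it, and sums it. The function name `sorted_sum` is a hypothetical helper made up for this illustration; `thrust::host_vector`, `thrust::device_vector`, `thrust::sort`, `thrust::reduce`, and `thrust::copy` are part of the Thrust library.

```cuda
#include <cstdio>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/copy.h>

// Hypothetical helper: sort `h` on the GPU, write the sorted values
// back into `h`, and return the sum of its elements.
int sorted_sum(thrust::host_vector<int> &h) {
    // Assigning a host_vector to a device_vector copies the data to the GPU.
    thrust::device_vector<int> d = h;

    // Both algorithms run on the device because they are given device iterators.
    thrust::sort(d.begin(), d.end());
    int total = thrust::reduce(d.begin(), d.end(), 0);

    // Copy the sorted values back to host memory.
    thrust::copy(d.begin(), d.end(), h.begin());
    return total;
}

int main() {
    thrust::host_vector<int> h(4);
    h[0] = 7; h[1] = 3; h[2] = 9; h[3] = 1;

    int total = sorted_sum(h);

    // Prints: sum = 20, sorted = 1 3 7 9
    std::printf("sum = %d, sorted = %d %d %d %d\n",
                total, h[0], h[1], h[2], h[3]);
    return 0;
}
```

The calls mirror the C++ standard library's containers and algorithms, which is presumably part of what makes porting existing C++ code easier.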

In addition, Russell said that Nvidia is becoming the leading "supercomputing" company, citing demonstrations in January at CES 2011 of several devices running Nvidia's Tegra 2 mobile processors, including Motorola and LG smartphones and Samsung's Galaxy tablet.

Russell also mentioned the unveiling at MWC 2011 earlier this month of Nvidia's next-generation four-core Kal-El smartphone processors, which are slated for next year.

Russell pointed out that Nvidia showed its processors running on actual devices at CES, rather than showing PowerPoint slides as its rivals did. Russell was likely referring to direct rival AMD, whose Fusion APU integrated graphics platform includes the Llano mobile processor, scheduled to come to market in the middle of this year.