The Rise of The GPU

By Roland Piquepaille

Today, your computer has at least two processors: the main unit, or CPU, and the GPU, dedicated to graphics. But it's a little-known secret that the GPU is now much more powerful than the CPU. The GPU has some drawbacks, such as a small memory, a difficult programming model and limited floating-point precision. Still, it's tempting to harness the GPU's power to help the main CPU with general-purpose computation. In "Supernova collapse simulated on a GPU," EE Times describes how computer scientists from Los Alamos National Laboratory have developed the Scout project to do exactly this. Using an Nvidia Quadro 3400 card, "Scout has achieved improved computational rates that are roughly 20 times faster than a 3-GHz Intel Xeon EM64T processor without the use of streaming SIMD extensions, and approximately four times faster than SIMD-enabled, fully optimized code." Impressive, isn't it? Now they want to go further and operate hundreds of GPUs in parallel. Read more about this exciting development...

Over the last several years, driven primarily by the entertainment industry, commodity graphics hardware has seen rapid enhancements in terms of both performance and programmability. The performance improvements have been significant enough that the graphics processor (GPU) now has roughly an order of magnitude more computing power and memory bandwidth than the CPU. This has led to our study of techniques that leverage the power of the GPU for improving the performance of visualization applications as well as for general-purpose computation.
As part of the Scout project toward a hardware-accelerated system for quantitatively driven visualization and analysis, we have devised a software environment and programming language that lets scientists write simple, expressive data-parallel programs to enable the computation of derived values and direct control of the mapping from data values to the pixels of a final rendered image.
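To give a flavor of what "computing derived values and mapping them to pixels" means, here is a minimal sketch in Python/NumPy. It is not Scout code -- the field name, value range and colormap are all made up for illustration -- but it shows the kind of per-point, data-parallel operation that Scout compiles down to the GPU.

import numpy as np

# Hypothetical 2D slice of simulated data (e.g. potential temperature);
# the values here are random stand-ins, not real simulation output.
theta = np.random.uniform(280.0, 320.0, size=(512, 512))

# Derived value: normalize the field to [0, 1] over the slice.
t_norm = (theta - theta.min()) / (theta.max() - theta.min())

# Map each normalized value to an RGB color (a simple blue-to-red ramp).
rgb = np.empty(theta.shape + (3,), dtype=np.float32)
rgb[..., 0] = t_norm          # red grows with temperature
rgb[..., 1] = 0.2             # constant green channel for contrast
rgb[..., 2] = 1.0 - t_norm    # blue shrinks with temperature

# In Scout, the equivalent per-point arithmetic runs as a fragment
# program over the whole grid at once on the graphics card.

The point of the environment is that a scientist writes only the short per-point expressions; the system handles getting the data onto the card and the colors onto the screen.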

Does this work? Yes, and pretty well!

This is all accomplished within an integrated development environment that provides on-the-fly compilation of code and the interactive exploration of the rendered results. Scout has achieved improved computational rates that are roughly 20 times faster than a 3-GHz Intel Xeon EM64T processor without the use of streaming SIMD extensions, and approximately four times faster than SIMD-enabled, fully optimized code.

Below are two examples of their results.

Here is a sample Scout code -- pretty simple to understand -- and the resulting image showing the color-mapped potential temperatures and black landmasses. (Credit: Scout project)
And here are the volume-rendered results of two selected entropy ranges colored by corresponding velocity magnitude. Both the entropy and velocity magnitude were computed directly using Scout. (Credit: Scout project)

Here are more details about the above image.

The first entropy range was partially clipped away to reveal the turbulent structure of the supernova's core, and the second (more transparent) entropy range isolated the details of the shock front.
Both ranges of entropy values were colored by the corresponding velocity magnitude values within the simulation. The entropy and velocity magnitude values, which are stored on a 256 x 256 x 256 computational grid, were computed in approximately 0.22 seconds using an Nvidia Quadro 3400 card.
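To picture what those derived quantities are, here is a small CPU-side sketch of the same idea in Python/NumPy. The array names, the random data and the entropy thresholds are assumptions made for the example; only the grid size matches the article.

import numpy as np

n = 256  # the 256 x 256 x 256 grid mentioned above

# Stand-in velocity components and entropy field (random, not real data).
vx, vy, vz = (np.random.randn(n, n, n).astype(np.float32) for _ in range(3))
entropy = np.random.uniform(0.0, 10.0, size=(n, n, n)).astype(np.float32)

# Velocity magnitude at every grid cell.
vel_mag = np.sqrt(vx**2 + vy**2 + vz**2)

# Two entropy ranges: one for the turbulent core, one for the shock front.
# The actual numeric ranges used by the researchers are not given here.
core_mask = (entropy > 2.0) & (entropy < 5.0)
shock_mask = entropy > 7.0

# Color each selected cell by its velocity magnitude; in the real system
# these values feed the volume renderer's transfer function.
core_values = vel_mag[core_mask]
shock_values = vel_mag[shock_mask]

For scale, a 256 x 256 x 256 grid holds about 16.8 million cells, so computing these values in roughly 0.22 seconds works out to tens of millions of per-cell evaluations per second on the graphics card.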

Even if these results are very promising, there are still some major challenges to overcome before a GPU can be used successfully as a general-purpose computational resource: transfers between the GPU and the CPU are slow; developing software for the GPU is difficult; floating-point precision is limited; and graphics cards don't incorporate large memories. Still, the researchers are optimistic these challenges can be solved. Here is what they are dreaming about.

We believe that the performance numbers, the rapid rate of innovations from the graphics hardware vendors and the recent announcement of support for multiple GPUs in a single desktop system show that the study of the GPU's impact on general-purpose computing is a viable area for continued research.
In addition, the GPU can provide scientists with a substantial resource for their desktop systems that can be leveraged to provide interactive data exploration and analysis. We are actively exploring the use of hundreds of GPUs in parallel, within a cluster-based environment, to address the memory limitations and explore the scalability of such systems.

The latest paper about this project has been published in the Proceedings of IEEE Visualization 2004 (pages 171-178, October 2004). Here is a link to the full paper, "Scout: A Hardware-Accelerated System for Quantitatively Driven Visualization and Analysis" (PDF format, 8 pages, 498 KB). The above illustrations, including some Scout sample code, were extracted from this paper.

And for even more information, you can read the EE Times for a special report about "the graphics chip as supercomputer." It contains links to five more articles, including one on "programming GPUs for general computing."

Sources: Patrick McCormick, for EE Times, December 13, 2004; and various other websites
