With Linux Clusters, Seeing Is Believing

By Roland Piquepaille

As last month's release of the latest Top500 list reminded us, the most powerful computers are now reaching speeds of dozens of teraflops. When these machines run a nuclear simulation or a global climate model for days or weeks, they produce datasets of tens of terabytes. How do you visualize, analyze, and understand such massive amounts of data? The answer is now obvious: with Linux clusters. In the long article "From Seeing to Understanding," Science & Technology Review looks at the technologies used at Lawrence Livermore National Laboratory (LLNL), which will host IBM's BlueGene/L next year. Visualization will be handled by a 128- or 256-node Linux cluster, with each node containing two processors sharing one graphics card. Meanwhile, EVEREST, built by Oak Ridge National Laboratory (ORNL), has a 35-million-pixel screen driven by a 14-node dual-Opteron cluster sending images to 27 projectors. Now that Linux superclusters have almost swallowed the high-end scientific computing market, they're building momentum in high-end visualization as well. Read more...

Let's start with ORNL's EVEREST.

ORNL's EVEREST is a large-scale immersive venue for data exploration and analysis. Its screen is 30 feet wide by 8 feet high -- comparable in size to 150 standard computer displays -- with a resolution of over 11,000 by 3,000 pixels, for a total of 35 million pixels.

Below is an early version of this visualization environment. "A nanotechnology application is on the big screen, with GIS and Astrophysics on the desktops." (Credit: ORNL)

The EVEREST screen at ORNL
"Visualizing and sifting through the incredible amount of information generated from massively parallel computer simulations is similar to trying to find a diamond in the desert," said George Fann of ORNL's Computer Science and Mathematics Division.
The power wall, dubbed EVEREST, changes that and provides a rich visual interactive experience and a highly collaborative environment for scientists to analyze their data. EVEREST makes use of commercial graphics and entertainment technologies and off-the-shelf dual-processor personal computers connected by a high-speed network to drive 27 projectors.
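A quick back-of-the-envelope check shows how 27 off-the-shelf projectors add up to a 35-million-pixel wall. Note the 9-by-3 projector grid and the 1280-by-1024 per-projector resolution below are assumptions chosen to match the article's "over 11 thousand by 3 thousand pixels" figure, not specifications from ORNL.

```python
# Back-of-the-envelope pixel count for a tiled powerwall.
# ASSUMPTIONS (not from the article): a 9x3 projector grid,
# each projector running at 1280x1024.
cols, rows = 9, 3              # 27 projectors total
tile_w, tile_h = 1280, 1024    # hypothetical per-projector resolution

wall_w = cols * tile_w         # wall width in pixels
wall_h = rows * tile_h         # wall height in pixels
total = wall_w * wall_h        # total pixels across the wall

print(cols * rows, "projectors ->", wall_w, "x", wall_h, "=", total, "pixels")
```

Under these assumptions the wall comes out to 11,520 by 3,072 pixels, about 35.4 million in total, which lines up with the numbers quoted above.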

This is the third Linux cluster deployed at ORNL to manage the large display environments.

Now, let's turn to LLNL. The article about the visualization efforts there is nine pages long and contains entire sections devoted to visualization advances in recent years and to how the visualization process has been transformed by the arrival of graphics cards equipped with powerful graphics processing units (GPUs).

Please read the whole article to learn more about these subjects. Here, I'll focus only on the hardware part of the project.

Below is a powerwall in Livermore's new Terascale Simulation Facility. "Powerwalls work by aggregating, or 'tiling,' the separate images from many projectors (right) to create one seamless image." (Credit: LLNL)

A powerwall at LLNL
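The "tiling" the caption describes -- carving one seamless frame into per-projector pieces -- can be sketched in a few lines of NumPy. This is an illustrative toy, not LLNL's software: the 9-by-3 grid and 1280-by-1024 tile size are hypothetical.

```python
# Toy sketch of powerwall tiling: slice one large frame into the
# sub-images that each projector displays. The grid and tile sizes
# here are hypothetical, not LLNL's actual configuration.
import numpy as np

COLS, ROWS = 9, 3             # hypothetical projector grid (27 projectors)
TILE_W, TILE_H = 1280, 1024   # hypothetical per-projector resolution

# One big grayscale frame covering the whole wall (height x width).
frame = np.zeros((ROWS * TILE_H, COLS * TILE_W), dtype=np.uint8)

def tile_for(frame, col, row):
    """Return the slice of the frame that projector (col, row) displays."""
    return frame[row * TILE_H:(row + 1) * TILE_H,
                 col * TILE_W:(col + 1) * TILE_W]

tiles = [tile_for(frame, c, r) for r in range(ROWS) for c in range(COLS)]
assert len(tiles) == COLS * ROWS          # one tile per projector
assert tiles[0].shape == (TILE_H, TILE_W) # each tile fills one projector
```

In a real system each cluster node would render or receive only its own tile, so no single machine ever has to hold the full 35-million-pixel frame in its framebuffer.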

And here is the history of Linux clusters used for visualization at LLNL.

The first Linux visualization cluster deployed at Livermore was the Production Visualization Cluster (PVC). PVC was designed to support unclassified applications on the 11.2-teraops Multiprogrammatic Capability Resource (MCR) machine and is being expanded to support the 22.9-teraops Thunder cluster supercomputer. With 64 nodes, each consisting of two processors and a graphics card, PVC went online in 2002.
By all measures, PVC has been highly successful. It is handling data sets of 23 terabytes to create animations involving 1 billion atoms. PVC generates these animations in about one-tenth the time and at one-fifth the cost of proprietary visualization engines, while simultaneously driving high-resolution displays in conference rooms and on powerwalls.
"PVC is our model for classified ASC [Advanced Simulation and Computing program] visualization engines," says computer scientist and VIEWS [Visual Interactive Environment for Weapons Simulation] program leader Steve Louis. The VIEWS team is preparing to deploy gViz, a 64-node cluster designed to support White, with each node consisting of two processors running at 3 gigahertz and sharing one graphics card. Similar clusters are planned to support Purple and BlueGene/L.
One important advantage of Linux cluster visualization engines is that clusters can be expanded easily. PVC is being tripled in size to support the unclassified demands brought on by Thunder. Similarly, gViz2 is a planned expansion of gViz to either 128 or 256 nodes.

The article also describes how all applications have been rewritten to run on the Linux clusters. Here is a short quote.

The new cluster software is open source, which means that the source code -- the software's programming code -- is freely available on the Internet through such organizations as SourceForge. "A benefit of this approach is that the public can use our software, make improvements, and notify us if they find any bugs," says VIEWS visualization project leader Sean Ahern.

So what is the final verdict about these Linux clusters?

The nearly unanimous opinion about the new Linux clusters is strong approval, if not downright devotion. "Users are impressed with the clusters," says computer scientist Hank Childs, who helps DNT physicists visualize complicated simulations on PVC for unclassified stockpile stewardship work.
