Intel Corp. today announced that all the compute modules of Aurora, an exascale supercomputer it’s helping to build for the U.S. Department of Energy, have been installed.
The system is a collaboration between the Energy Department, Intel and Hewlett Packard Enterprise Co. It’s located at the Argonne National Laboratory. Scientists will use the system to run artificial intelligence models, simulations and large-scale data analytics applications.
Aurora is expected to achieve a theoretical peak performance of more than two exaflops later this year. That will make it nearly twice as fast as the world’s fastest operational supercomputer, another Energy Department system called Frontier. One exaflop equals a billion billion calculations per second.
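To put that figure in perspective, here is a back-of-envelope calculation; the laptop comparison point is a hypothetical assumption, not a figure from Intel or the Energy Department:

```python
# What "two exaflops" means in raw operations per second.
exaflop = 10**18             # one exaflop = a billion billion (10^18) calculations per second
aurora_peak = 2 * exaflop    # Aurora's expected theoretical peak

# Hypothetical comparison: a laptop sustaining roughly one teraflop (10^12 ops/sec)
laptop_flops = 10**12
print(aurora_peak // laptop_flops)  # → 2000000, i.e. about two million such laptops
```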
“While we work toward acceptance testing, we’re going to be using Aurora to train some large-scale open source generative AI models for science,” said Rick Stevens, an associate laboratory director at Argonne National Laboratory. “Aurora, with over 60,000 Intel Max GPUs, a very fast I/O system, and an all-solid-state mass storage system, is the perfect environment to train these models.”
Aurora comprises 10,624 compute modules known as blades. Those blades, which weigh 70 pounds each, run in 166 refrigerator-sized cabinets. The fully assembled system takes up the same amount of space as two professional basketball courts.
Each Aurora blade includes two central processing units from Intel’s Xeon Max Series CPU line and six graphics cards from Intel’s Max Series GPU line. The processors are supported by memory chips, network equipment and cooling gear built into each blade.
Intel’s Xeon Max Series CPU chips are based on a 10-nanometer architecture. They’re optimized for workloads such as AI models that require the ability to frequently move data to and from memory. To accelerate such workloads, the CPUs include a type of high-speed memory called HBM, or high-bandwidth memory, that wasn’t available in earlier Intel chips.
Intel’s Max Series GPUs, which form the other core building block of Aurora, are also optimized for AI workloads. The language in which a graphics card expresses computations is known as its instruction set. The instruction set of Intel’s Max Series GPUs is specifically geared towards matrix multiplications, the mathematical operations that AI models use to process data.
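As a rough illustration of why that matters, the core of a neural network layer is exactly such a matrix multiplication. The sketch below is a toy example (not Aurora code), with made-up dimensions:

```python
import numpy as np

# Toy illustration: one neural-network layer boils down to multiplying
# a matrix of inputs by a matrix of learned weights.
rng = np.random.default_rng(0)
inputs = rng.standard_normal((32, 768))     # batch of 32 examples, 768 features each
weights = rng.standard_normal((768, 1024))  # hypothetical layer weights

outputs = inputs @ weights  # the matrix multiply that GPU instruction sets accelerate
print(outputs.shape)        # → (32, 1024)
```

An AI model chains thousands of these multiplications, which is why hardware that executes them natively delivers such large speedups.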
The chips also include up to 128 ray tracing units. Ray tracing is a method of rendering lighting and shadow effects. According to Intel, the technology speeds up scientific applications’ data visualization features.
Overall, Aurora features 21,248 CPUs and 63,744 graphics cards. That makes it the world’s largest GPU cluster. The chips are supported by a 220 petabyte pool of object storage that Aurora will use to store scientific applications’ data.
Making full use of Aurora’s performance requires researchers to specifically optimize their applications for the system. To ease the task, the Energy Department has created a miniature version of Aurora called Sunspot. It provides an environment in which researchers can test different software optimization methods.
More than a dozen research teams were using Sunspot as of earlier this year. Once Aurora becomes operational, the teams will start moving over code from Sunspot. Early Aurora users will focus on identifying any technical issues that may have to be resolved before the first production applications can be deployed.