Lawrence Livermore National Laboratory (LLNL), Hewlett Packard Enterprise (HPE) and Advanced Micro Devices, Inc. (AMD) today announced the selection of AMD as the node supplier for El Capitan, projected to be the world’s most powerful supercomputer when it is fully deployed in 2023.
With its advanced central and graphics processing units (CPUs/GPUs), El Capitan’s peak performance is expected to exceed 2 exaFLOPS, ensuring the National Nuclear Security Administration (NNSA) laboratories (LLNL, Sandia National Laboratories and Los Alamos National Laboratory) can meet their primary mission of keeping the nation’s nuclear stockpile safe, secure and reliable. (One exaFLOPS is 1 quintillion, or 10^18, floating-point operations per second.)
Funded by the Advanced Simulation and Computing (ASC) program at the Department of Energy’s (DOE) National Nuclear Security Administration, El Capitan will perform complex and increasingly predictive modeling and simulation for the NNSA’s vital Life Extension Programs (LEPs), which address weapons aging and emergent threat issues in the absence of underground nuclear testing.
“This unprecedented computing capability, powered by advanced CPU and GPU technology from AMD, will sustain America’s position on the global stage in high performance computing and provide an observable example of the commitment of the country to maintaining an unparalleled nuclear deterrent,” said LLNL Director Bill Goldstein. “Today’s news provides a prime example of how government and industry can work together for the benefit of the entire nation.”
El Capitan will be powered by next-generation AMD EPYC processors, codenamed “Genoa” and featuring the “Zen 4” processor core; next-generation AMD Radeon Instinct GPUs based on a new compute-optimized architecture for workloads including HPC and AI; and the AMD Radeon Open Compute platform (ROCm) heterogeneous computing software. The nodes will support simulations used by the NNSA labs to address the demands of the LEPs, whose computational requirements are growing due to the ramping up of stockpile modernization efforts and in response to rapidly evolving threats from America’s adversaries.
Providing enormous computational capability for the energy used, the GPUs will deliver the majority of El Capitan’s peak floating-point performance. This will enable LLNL scientists to run high-resolution 3D models faster and to increase the fidelity and repeatability of calculations, making those simulations truer to life.
“We have been pursuing a balanced investment effort at NNSA in advancing our codes, our platforms and our facilities in an integrated and focused way,” said Michel McCoy, Weapon Simulation and Computing Program Director at LLNL. “And our teams and industrial partners will deliver this capability as planned to the nation. Naturally, this has required an intimate, sustained partnership with our industry technology partners and between the tri-labs to be successful.”
Anticipated to be one of the most capable supercomputers in the world, El Capitan will have a significantly greater per-node capability than any current system, LLNL researchers said. El Capitan’s graphics processors will be amenable to AI and machine learning-assisted data analysis, further propelling LLNL’s sizable investment in AI-driven scientific workloads. These workloads will supplement scientific models that researchers hope will be faster, more accurate and intrinsically capable of quantifying uncertainty in their predictions, and will be increasingly used for stockpile stewardship applications. The use of AMD’s GPUs also is anticipated to dramatically increase El Capitan’s energy efficiency as compared to systems using today’s graphics processors.
“El Capitan will drive unprecedented advancements in HPC and AI, powered by the next-generation AMD EPYC CPUs and Radeon Instinct GPUs,” said Forrest Norrod, senior vice president and general manager, Datacenter and Embedded Systems Group, AMD. “Building on our strong foundation in high-performance computing and adding transformative coherency capabilities, AMD is enabling the NNSA Tri-Lab community — LLNL and the Los Alamos and Sandia national laboratories — to achieve their mission critical objectives and contribute new AI advancements to the industry. We are extremely proud to continue our exascale work with HPE and NNSA and look forward to the delivery of the most powerful supercomputer in the world, expected in early 2023.”
El Capitan also will integrate many advanced features that are not yet widely deployed, including HPE’s advanced Cray Slingshot interconnect network, which will enable large calculations across many nodes, an essential requirement for the NNSA laboratories’ simulation workloads. In addition to the capabilities that Cray Slingshot provides, HPE and LLNL are partnering to actively explore new HPE optics technologies that integrate electrical-to-optical interfaces that could deliver higher data transmission at faster speeds with improved power efficiency and reliability. El Capitan also will feature the new Cray Shasta software platform, which will have a new container-based architecture to enable administrators and developers to be more productive, and to orchestrate LLNL’s complex new converged HPC and AI workflows at scale.
“As an industry and as a nation, we have achieved a major milestone in computing. HPE is honored to support DOE, NNSA and Lawrence Livermore National Laboratory in a critical strategic mission to advance the United States’ position in security and defense,” said Peter Ungaro, senior vice president and general manager, HPC and Mission Critical Systems (MCS), at HPE. “The computing power and capabilities of this system represent a new era of innovation that will unlock solutions to society’s most complex issues and answer questions we never thought were possible.”
The exascale ecosystem being developed through the sustained efforts of DOE’s Exascale Computing Initiative (ECI) will further ensure El Capitan has formidable capabilities from day one. Through funding from the NNSA’s ASC program, in collaboration with the DOE Office of Science’s Advanced Scientific Computing Research (ASCR) program, ongoing investments in hardware and software technology will assure highly functional hardware and tools to meet DOE’s needs in the next decade. The El Capitan system also will benefit from a partnership with Oak Ridge National Laboratory, which is taking delivery of a similar system from HPE about one year earlier than El Capitan.
El Capitan would not have been possible without the investments made by DOE’s Exascale PathForward program, which provided funding for American companies including HPE/Cray and AMD to accelerate the technologies necessary to maximize energy efficiency and performance of exascale supercomputers.
Besides supporting the nuclear stockpile, El Capitan will perform secondary national security missions, including nuclear nonproliferation and counterterrorism. NNSA laboratories are building machine learning and AI into computational techniques and analysis that will benefit NNSA’s primary missions and unclassified projects such as climate modeling and cancer research for DOE.