
Intel Delivers 10,000 Aurora Supercomputer Blades, Benchmarks Against Nvidia and AMD
With two exaflops of performance, the Intel-powered Aurora supercomputer is expected to beat the AMD-powered Frontier supercomputer, currently the fastest in the world, and take the top spot on the Top500 list of the fastest supercomputers. However, due to Intel's ongoing delays in delivering the hardware, Aurora did not make the list announced today because it has not yet submitted a benchmark to the Top500 committee. Intel today shared new details about the system and announced at the ISC conference that it has delivered more than 10,000 working blades for the Aurora supercomputer, close to the number of blades required for full deployment. We'll cover the details below.
However, Intel says the system will be fully operational later this year, and it has shared benchmarks pitting Aurora against AMD- and Nvidia-powered supercomputers, claiming a 2x performance advantage over AMD's MI250X GPUs and a 20% advantage over Nvidia's H100 GPUs.
Intel says it has supplied the Argonne Leadership Computing Facility (ALCF) with silicon for more than 10,000 blades, including both fourth-generation Sapphire Rapids Xeon chips and Ponte Vecchio GPUs.
However, Aurora is designed to work with Intel's oft-delayed HBM-equipped Sapphire Rapids "Xeon Max" chips. Because of those delays, Intel initially began shipping non-HBM Sapphire Rapids chips to the ALCF, and the facility started populating Aurora with standard non-HBM Sapphire Rapids chips as a temporary measure.
Intel is now supplying the ALCF with the faster HBM-equipped Xeon Max chips, but not all of the 10,000 blades it says it has delivered have Max chips. We asked Intel, and company representatives confirmed that not all blades are equipped with the final Xeon Max silicon. The company tells us that about 75% of the blades contain the latest Xeon Max revision of the silicon. Presumably, this is the bottleneck that prevents the system from submitting a benchmark for the Top500 list.
The system consists of 166 racks with 64 blades per rack, for a total of 10,624 blades, so the more than 10,000 blades delivered are enough for the system to operate, just not at full performance.
Intel has also shared more detailed specifications for the Aurora supercomputer. With 21,248 CPUs and 63,744 Ponte Vecchio GPUs, Aurora will either match or exceed two exaflops of performance when it comes fully online before the end of the year. The system also has 10.9 petabytes (PB) of DDR5 memory, 1.36 PB of HBM attached to the CPUs, 8.16 PB of GPU memory, and 230 PB of storage capacity that delivers 31 TB/s of bandwidth.
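As a quick sanity check, the quoted totals can be cross-checked with a little arithmetic. The sketch below uses only figures from this article. The decimal unit conversion (1 PB = 1,000,000 GB) and all variable and function names are assumptions made for illustration.

```python
# Figures quoted in the article.
RACKS = 166
BLADES_PER_RACK = 64
CPUS = 21_248
GPUS = 63_744
DDR5_PB = 10.9
CPU_HBM_PB = 1.36
GPU_MEM_PB = 8.16
BLADES_DELIVERED_MIN = 10_000  # "more than 10,000", so this is a lower bound

GB_PER_PB = 1_000_000  # assumption: decimal (SI) units


def per_device_gb(total_pb: float, devices: int) -> float:
    """Average capacity per device in GB, given a system-wide total in PB."""
    return total_pb * GB_PER_PB / devices


blades = RACKS * BLADES_PER_RACK   # total blades in the full system
cpus_per_blade = CPUS / blades     # CPUs on each blade
gpus_per_blade = GPUS / blades     # GPUs on each blade

ddr5_per_cpu = per_device_gb(DDR5_PB, CPUS)
hbm_per_cpu = per_device_gb(CPU_HBM_PB, CPUS)
mem_per_gpu = per_device_gb(GPU_MEM_PB, GPUS)

# Lower bound on the share of blades already delivered.
delivered_share = BLADES_DELIVERED_MIN / blades

print(f"blades: {blades}, CPUs/blade: {cpus_per_blade:g}, GPUs/blade: {gpus_per_blade:g}")
print(f"DDR5 per CPU: {ddr5_per_cpu:.0f} GB, HBM per CPU: {hbm_per_cpu:.0f} GB")
print(f"memory per GPU: {mem_per_gpu:.0f} GB")
print(f"delivered share: at least {delivered_share:.1%}")
```

The totals are internally consistent: they work out to two CPUs and six GPUs per blade, roughly 64 GB of HBM per CPU, and roughly 128 GB of memory per GPU. The DDR5 figure lands near 512 GB per CPU, with the small gap explained by the rounding of the 10.9 PB total.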
Intel also announced that Aurora will begin running generative AI workloads. The 'Aurora GPT' large language model will be science-driven and will have 1 trillion parameters, built on Megatron and DeepSpeed foundations. Intel provided the following summary of the project:
“These generative AI models for science will be trained on general text, code, scientific texts and structured scientific data from biology, chemistry, materials science, physics, medicine and other sources. The resulting models (with as many as 1 trillion parameters) will be used in a variety of scientific applications, from the design of molecules and materials to the synthesis of knowledge across millions of sources to suggest new and interesting experiments in systems biology, polymer chemistry and energy materials, climate science and cosmology. The model will also be used to accelerate the identification of biological processes related to cancer and other diseases and to suggest targets for drug design.”
Intel also shared several benchmark results from the Sunspot system, a smaller two-rack version of Aurora with a total of 128 nodes. Intel compared Sunspot's performance with estimated numbers representing the 'similarly sized' Polaris supercomputer with Nvidia A100 GPUs and the Crusher supercomputer powered by AMD's MI250X GPUs. Unfortunately, Intel didn't provide test notes or details of these configurations, so take the results with more skepticism than usual.
In a single-node test with a reactor simulation workload, Intel claims its system is 45% faster than the Nvidia competitor and 12% faster than the AMD system. Turning to the scalability benchmarks, Intel claims that after normalizing the total number of GPUs used in the test systems to 96 (AMD and Nvidia nodes each have four GPUs, while the Intel system has six GPUs per node), Sunspot delivers more than double the performance of both the AMD and Nvidia systems in a Monte Carlo workload. At 90 nodes in the NWChemEx workload, Intel claims it is 72% faster than the 90-node Nvidia-powered Polaris system.
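To make the 96-GPU normalization concrete, the sketch below works out how many nodes each system would need to reach the same GPU count. The GPUs-per-node values come from the article. The dictionary labels and the helper name are illustrative assumptions, and Intel has not published the actual test configurations.

```python
# GPUs per node as stated in the article; the labels are illustrative.
GPUS_PER_NODE = {
    "Sunspot (Intel)": 6,
    "Polaris (Nvidia)": 4,
    "Crusher (AMD)": 4,
}
TARGET_GPUS = 96  # GPU count Intel says it normalized to


def nodes_for(target_gpus: int, gpus_per_node: int) -> int:
    """Number of whole nodes needed to provide exactly target_gpus GPUs."""
    nodes, remainder = divmod(target_gpus, gpus_per_node)
    if remainder:
        raise ValueError("target GPU count is not a whole number of nodes")
    return nodes


nodes_needed = {
    system: nodes_for(TARGET_GPUS, gpus)
    for system, gpus in GPUS_PER_NODE.items()
}

for system, nodes in nodes_needed.items():
    print(f"{system}: {nodes} nodes for {TARGET_GPUS} GPUs")
```

At an equal GPU count the Intel system uses fewer nodes (16 versus 24), so this comparison is per GPU rather than per node. The 90-node NWChemEx result is the opposite case: at an equal node count, the Intel system would field more GPUs.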
The Aurora supercomputer was first announced in 2015 with an estimated completion date of 2018. At the time, the system was designed to use Knights Hill processors, which were later canceled. The system has since gone through several redesigns and reschedulings, with a new version of Aurora announced in 2019 that was slated to deliver one exaflop in 2021. Yet another rescheduling in late 2021 claimed that the system would deliver two exaflops when complete, which is now scheduled for this year.
The long and winding road continues, but the end finally appears to be in sight. Intel says it will deliver all of the Xeon Max processors needed to complete the system soon, and that it will finish the system and make its first Top500 submission before the end of the year.