ARM-Based Fugaku Supercomputer Now World’s Fastest Supercomputer
Japan Holds Supercomputer Performance Crown with Massive Cluster Of Fujitsu A64FX ARM Processors
The Fugaku supercomputer, located in Kobe, Japan and developed jointly by RIKEN and Fujitsu Limited, recently took the top spot in several supercomputer rankings. It is the first time since June 2011 that Japan has held the Top500 crown, and the first time ever that one supercomputer has simultaneously held the HPCG, HPL-AI, and Graph500 records. Fugaku also sits at #9 on the Green500 list, which places it between the IBM Power9-based Summit in the US and Marconi-100 in Italy.
The supercomputer, a CPU-only design built on the ARM architecture, will comprise 158,976 nodes when complete and is housed at the RIKEN Center for Computational Science (R-CCS). Ten years after the initial concept was proposed and six years after the official start of the project, Fugaku is slated for full operation beginning April 2021. It is intended to further Japan’s “Society 5.0” plan and will be used to tackle major societal and scientific problems facing the country and the world: simulating natural disasters; forecasting weather and climate change; researching drugs for personalized and preventative medicine; researching clean energy creation, use, and storage; and developing new materials and new production and design processes. RIKEN also notes that Fugaku will be used to investigate fundamental scientific questions, such as the fundamental laws and evolution of the universe.
Fugaku is currently running 152,064 nodes. Each compute node features a Fujitsu-designed 48-core A64FX processor and 32 GB of HBM2 memory, for a total of 7,299,072 cores and 4,866,048 GB of memory. Each node has a Tofu interconnect (28 Gbps x 2 lanes x 10 ports) providing up to 560 Gbps of inter-node bandwidth, plus 16 PCIe 3.0 lanes for connecting GPUs, FPGAs, other accelerator cards, or I/O. The A64FX implements the ARMv8.2-A architecture with the SVE (Scalable Vector Extension) SIMD instruction set in a 512-bit wide implementation. It has 48 compute cores and 4 assistant cores (for assigning work, inter-CMG and inter-node communication, host OS tasks, etc.), clocked at 2.2 GHz. The cores are divided into four core memory groups (CMGs), each with 12 compute cores, 1 assistant core, 8 MB of L2 cache, and 8 GB of HBM2 memory with 256 GB/s of bandwidth. The processors are built on TSMC’s N7 7nm node and have 8.786 billion transistors. According to Fujitsu, the A64FX has extensive RAS (reliability, availability, and serviceability) features, with error detection and correction throughout the chip. Each processor/node is rated at more than 2.7 TFLOPS peak performance.
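The system totals above follow directly from the per-node figures. As a quick sanity check, here is a minimal Python sketch that rebuilds them; the constant and function names are illustrative and not taken from any Fujitsu or RIKEN tooling.

```python
# Per-node figures as quoted in the article.
NODES_RUNNING = 152_064
COMPUTE_CORES_PER_NODE = 48      # assistant cores are not counted
HBM2_GB_PER_NODE = 32
CMGS_PER_NODE = 4
HBM2_GB_S_PER_CMG = 256          # GB/s of HBM2 bandwidth per CMG
TOFU_GBPS_PER_LANE = 28
TOFU_LANES = 2
TOFU_PORTS = 10


def node_summary() -> dict:
    """Aggregate the per-CMG and per-link figures up to one node."""
    return {
        "compute_cores": COMPUTE_CORES_PER_NODE,
        "hbm2_gb": HBM2_GB_PER_NODE,
        "hbm2_bandwidth_gb_s": CMGS_PER_NODE * HBM2_GB_S_PER_CMG,
        "tofu_gbps": TOFU_GBPS_PER_LANE * TOFU_LANES * TOFU_PORTS,
    }


def system_totals(nodes: int = NODES_RUNNING) -> dict:
    """Scale the per-node core and memory counts to the whole machine."""
    node = node_summary()
    return {
        "compute_cores": nodes * node["compute_cores"],
        "hbm2_gb": nodes * node["hbm2_gb"],
    }


if __name__ == "__main__":
    print(node_summary())
    print(system_totals())
```

Multiplying the four CMGs by 256 GB/s each also gives the per-node HBM2 bandwidth of 1,024 GB/s, a figure the article does not state outright.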
Speaking of performance, Fugaku sits at #1 on the Top500 list with an Rmax of 415,530 TFLOPS and an Rpeak of 513,854.7 TFLOPS, along with a whopping 28.335 MW power rating. The next fastest supercomputer, Summit in the US, draws 10.096 MW, has less than a third as many cores, and delivers less than half the performance (at least in the FP64 workload measured for the Top500). Fugaku scored 13,400 TFLOPS in HPCG using 138,240 nodes, 1.421 ExaFLOPS in HPL-AI using 126,720 nodes (the world’s first exascale rating), and 70,990 gigaTEPS in Graph500, which ran a breadth-first search of a massive graph with 1.1 trillion nodes and 17.6 trillion edges in 0.25 seconds. That Graph500 result is more than double the 31,303 gigaTEPS of Fugaku’s predecessor, the K computer, and roughly triple the 23,756 gigaTEPS of the Chinese Sunway TaihuLight.
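Because the benchmarks were run on different node counts, the headline numbers are easier to compare once normalized. The sketch below derives HPL efficiency (Rmax over Rpeak), energy efficiency, and per-node throughput from the figures quoted above. It assumes the HPL run used all 152,064 installed nodes, which the article implies but does not state; the function names are illustrative.

```python
# Benchmark score in TFLOPS and the node count quoted for each run.
# Assumption: the HPL (Top500) run used all 152,064 installed nodes.
RESULTS_TFLOPS = {
    "HPL": (415_530.0, 152_064),
    "HPCG": (13_400.0, 138_240),
    "HPL-AI": (1_421_000.0, 126_720),  # 1.421 ExaFLOPS
}
RPEAK_TFLOPS = 513_854.7
POWER_MW = 28.335


def hpl_efficiency() -> float:
    """Fraction of theoretical peak achieved on HPL (Rmax / Rpeak)."""
    rmax, _ = RESULTS_TFLOPS["HPL"]
    return rmax / RPEAK_TFLOPS


def gflops_per_watt() -> float:
    """HPL energy efficiency: GFLOPS delivered per watt drawn."""
    rmax, _ = RESULTS_TFLOPS["HPL"]
    return (rmax * 1e3) / (POWER_MW * 1e6)


def per_node_tflops(benchmark: str) -> float:
    """Benchmark score divided by the number of nodes used in that run."""
    score, nodes = RESULTS_TFLOPS[benchmark]
    return score / nodes


if __name__ == "__main__":
    print(f"HPL efficiency: {hpl_efficiency():.1%}")
    print(f"Energy efficiency: {gflops_per_watt():.2f} GFLOPS/W")
    for name in RESULTS_TFLOPS:
        print(f"{name}: {per_node_tflops(name):.3f} TFLOPS per node")
```

Under these assumptions, HPL sustains roughly 81% of peak at about 14.7 GFLOPS per watt. Per node, HPL-AI delivers about four times the HPL rate, which reflects its use of lower-precision arithmetic.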
AnandTech further notes that beyond FP64, the numbers climb even higher at lower precision: 1.07 ExaOPS at FP32, 2.15 ExaOPS at FP16, and 4.30 ExaOPS at INT8.
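These lower-precision figures are consistent with a simple scaling argument: a 512-bit SVE vector holds twice as many elements each time the element width halves. The sketch below rests on two assumptions that the article does not confirm. First, that peak throughput scales linearly with vector lane count. Second, that AnandTech's figures describe the full 158,976-node system at the same per-node FP64 peak implied by the Top500 Rpeak.

```python
RPEAK_TFLOPS = 513_854.7   # Top500 Rpeak, measured on 152,064 nodes
NODES_MEASURED = 152_064
NODES_PLANNED = 158_976    # full system size quoted in the article
SVE_VECTOR_BITS = 512


def peak_exaops(element_bits: int, nodes: int = NODES_PLANNED) -> float:
    """Estimate peak throughput in ExaOPS for a given element width.

    Assumes throughput scales with the number of elements that fit in
    one 512-bit SVE vector, relative to 64-bit elements.
    """
    fp64_tflops_per_node = RPEAK_TFLOPS / NODES_MEASURED
    lanes = SVE_VECTOR_BITS // element_bits
    fp64_lanes = SVE_VECTOR_BITS // 64
    scale = lanes / fp64_lanes
    return fp64_tflops_per_node * nodes * scale / 1e6  # TFLOPS -> Exa


if __name__ == "__main__":
    for label, bits in [("FP64", 64), ("FP32", 32), ("FP16", 16), ("INT8", 8)]:
        print(f"{label}: {peak_exaops(bits):.2f} ExaOPS")
```

With those assumptions the estimate lands on the quoted 1.07, 2.15, and 4.30 ExaOPS, starting from an FP64 peak of about 0.54 ExaFLOPS for the full machine.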
With US-based Sandia National Laboratories planning an A64FX-based supercomputer of its own, and exascale supercomputers in the works around the world (like the US DOE’s 2 ExaFLOPS El Capitan, planned for 2023), Fugaku may not reign for long, but for now Japan holds the performance crown. Fugaku is currently being used experimentally to assist with COVID-19 research: searching for therapeutic drugs, simulating the spread of the virus, and evaluating the effectiveness of Japan’s contact tracing app. That last study found that at least 60% of the population needs to use the app for it to be effective. Full operation of Fugaku is slated for fiscal 2021 (beginning April 2021).
According to Satoshi Matsuoka, director of RIKEN R-CCS:
“Ten years after the initial concept was proposed, and six years after the official start of the project, Fugaku is now near completion. Fugaku was developed based on the idea of achieving high performance on a variety of applications of great public interest, such as the achievement of Society 5.0, and we are very happy that it has shown itself to be outstanding on all the major supercomputer benchmarks. In addition to its use as a supercomputer, I hope that the leading-edge IT developed for it will contribute to major advances on difficult social challenges such as COVID-19.”