From WikiChip
Difference between revisions of "zettascaler"

Revision as of 08:45, 14 November 2017

ZettaScaler is a series of Japanese supercomputers using processors designed by PEZY and liquid cooling systems designed by ExaScaler.

Overview

ZettaScaler supercomputers are built from dense server aggregates called 'Bricks'. Cooling is done at the enclosure level, with each enclosure holding a number of bricks. The cooling system is designed by ExaScaler and uses 3M's Fluorinert electronic liquid, an electrically insulating, inert fluorocarbon-based fluid.

ZettaScalers pair a host CPU (typically a Xeon E5) with PEZY-SCx processors. The host offloads work to the PEZY-SCx chips, which act as GPDSP/GPGPU accelerators and are programmed through PZCL, an OpenCL-like programming interface.

ZettaScaler-1.x

ZettaScaler-1.x (ZS-1.x) is based on the PEZY-SC. Each node (brick) was built from two Xeon E5 processors connected over Intel's QPI, four sets of dual PEZY-SC chips in PCIe card slots, and InfiniBand FDR HCAs. General-purpose Intel chips were used to reduce cost and shorten the development cycle. For the same reason, the original ZS-1.0 used an off-the-shelf SuperMicro board. Starting with the ZS-1.4, PEZY moved to a custom "brick".

With the introduction of the ZS-1.6, PEZY redesigned the package of the PEZY-SC. The new chip, the PEZY-SCnp (NP for New Package), addressed a number of signal-quality issues (DRAM and PCIe frequency failures).

System: ZettaScaler-1.0 | ZettaScaler-1.4 | ZettaScaler-1.5 | ZettaScaler-1.6
Introduction: October 2014 | April 2015 | October 2015 | April 2016
Board: SuperMicro Board | Brick
Host: 2x Xeon E5-2660 v2 | 2x Xeon E5-2618L v3
Chip: PEZY-SC | PEZY-SCnp
Memory: DDR3 256 GB | DDR4 64 GB | DDR4 128 GB

TOP500:
Suiren: 187 TFLOPS (Rank 369) 2014/11; 206 TFLOPS (Rank 366) 2015/06
Shoubu: 412 TFLOPS (Rank 169) 2015/06
SuirenBlue: 194 TFLOPS (Rank 392) 2015/06
Shoubu: 1,001 TFLOPS (Rank 94) 2016/06
Satsuki: 290 TFLOPS (Rank 486) 2016/06

Green500:
Suiren: 4.95 GFLOPS/W (Rank 2) 2014/11; 6.22 GFLOPS/W (Rank 3)
Shoubu: 7.03 GFLOPS/W (Rank 1) 2015/06
SuirenBlue: 6.84 GFLOPS/W (Rank 2) 2015/06
Shoubu: 6.67 GFLOPS/W (Rank 1) 2016/06
Satsuki: 6.20 GFLOPS/W (Rank 2) 2016/06

ZettaScaler-2.x

With the introduction of the ZettaScaler-2.x, PEZY redesigned the Brick. PEZY moved to higher integration through advanced 3D packaging: the DRAM is connected to the processor through the ThruChip Interface (TCI), which uses wireless near-field inductive coupling instead of traditional through-silicon vias (TSVs). The ZS-2.x computers are powered by the PEZY-SC2, which incorporates 2,048 cores.

Module: pezy-sc2 module.png
Brick: pezy-sc2 brick.png

On the ZS-2.x, each module contains a higher-capacity 48 V power supply and measures 160 mm (W) x 100 mm (D) x 28 mm (H). Each brick contains 32 modules.

zettascaler tanks.png

Sixteen such bricks go into a single tank.
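
The packaging hierarchy described here (32 modules per brick, 16 bricks per tank) fixes how many modules a tank can hold. The following is a minimal Python sketch of that arithmetic; the constant and function names are illustrative only and do not come from PEZY or ExaScaler documentation.

```python
import math

MODULES_PER_BRICK = 32  # stated in the text
BRICKS_PER_TANK = 16    # stated in the text


def modules_per_tank() -> int:
    """Number of modules in one fully populated tank."""
    return MODULES_PER_BRICK * BRICKS_PER_TANK


def tanks_needed(modules: int):
    """Return (fractional tanks occupied, whole tanks required)."""
    occupied = modules / modules_per_tank()
    return occupied, math.ceil(occupied)


if __name__ == "__main__":
    print(modules_per_tank())
    print(tanks_needed(10_000))
```

A full tank therefore holds 512 modules. How many processors that represents depends on the number of chips per module, which this sketch deliberately leaves out.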

ZettaScaler-2.2

ZS-2.2 Enclosure

The ZettaScaler-2.2 (ZS-2.2) architecture debuted on the Top500 list in November 2017, with multiple systems being upgraded to it.

The highest-performing system was the Gyoukou supercomputer, which officially peaked at an efficiency of 14.173 GFLOPS/W, surpassing the previous record of 14.11 GFLOPS/W held by TSUBAME 3.0. The system features 10,000 PEZY-SC2 chips; to improve yield, each chip has 1,984 cores enabled instead of the full 2,048, for a total of 19,840,000 PEZY cores. In the ZS-2.2, the SC2 chips are clocked at 700 MHz instead of their announced frequency of 1 GHz. For every eight SC2 chips there is a single 16-core Xeon D processor, for a combined total of 1,250 Xeon processors and 20,000 x86 cores. The Gyoukou ZS-2.2 has a Linpack performance of 19,135.8 TFLOPS against a theoretical peak of 28,192.0 TFLOPS, while consuming 1,350 kW. With 10,000 PEZY-SC2 chips, only 19.53 of the 26 available immersion tanks are filled, leaving considerable room to grow. Additionally, with 19,860,000 cores (19,840,000 PEZY + 20,000 x86), the ZettaScaler-2.2 surpassed the Chinese supercomputer Sunway TaihuLight as the highest core-count supercomputer by over 9 million cores.
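
The Gyoukou figures quoted here can be cross-checked with simple arithmetic. The sketch below recomputes the Xeon count, the core totals, and the power efficiency from the numbers in the text. All names are illustrative. The efficiency is derived from the rounded 1,350 kW figure, so a small deviation from the official 14.173 GFLOPS/W is to be expected.

```python
PEZY_CHIPS = 10_000
ENABLED_CORES_PER_PEZY = 1_984   # out of 2,048 on the die
PEZY_PER_XEON = 8
CORES_PER_XEON = 16
RMAX_TFLOPS = 19_135.8           # Linpack performance
POWER_KW = 1_350                 # rounded figure from the text


def gyoukou_summary() -> dict:
    """Recompute the headline Gyoukou numbers from the quoted inputs."""
    xeons = PEZY_CHIPS // PEZY_PER_XEON
    pezy_cores = PEZY_CHIPS * ENABLED_CORES_PER_PEZY
    xeon_cores = xeons * CORES_PER_XEON
    return {
        "xeon_processors": xeons,
        "pezy_cores": pezy_cores,
        "xeon_cores": xeon_cores,
        "total_cores": pezy_cores + xeon_cores,
        # TFLOPS per kW equals GFLOPS per W (both scale by 1,000).
        "gflops_per_watt": RMAX_TFLOPS / POWER_KW,
        # Fraction of each chip's cores that is enabled.
        "enabled_fraction": ENABLED_CORES_PER_PEZY / 2_048,
    }


if __name__ == "__main__":
    for key, value in gyoukou_summary().items():
        print(f"{key}: {value}")
```

The recomputed totals match the quoted 1,250 Xeon processors and 19,860,000 cores, and the efficiency comes out at roughly 14.17 GFLOPS/W.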

In addition to the Gyoukou supercomputer, three smaller Japanese supercomputers based on the ZettaScaler-2.2 were introduced. All three beat the Gyoukou in efficiency and took the top three spots on the Green500 list:

Top500/Green500 November 2017 Ranking

Green500 Rank | Top500 Rank | System | Cores | Rmax | Power | Efficiency | Configuration
1 | 259 | Shoubu system B | 794,400 | 842.0 TFLOPS | 50 kW | 17.009 GFLOPS/W | 50x (8x PEZY-SC2 + 1x Xeon D-1571)
2 | 307 | Suiren2 | 762,624 | 788.2 TFLOPS | 47 kW | 16.759 GFLOPS/W | 48x (8x PEZY-SC2 + 1x Xeon D-1571)
3 | 276 | Sakura | 794,400 | 824.7 TFLOPS | 50 kW | 16.657 GFLOPS/W | 50x (8x PEZY-SC2 + 2x Xeon E5-2618L v3)
5 | 4 | Gyoukou | 19,860,000 | 19,135.8 TFLOPS | 1,350 kW | 14.173 GFLOPS/W | 1250x (8x PEZY-SC2 + 1x Xeon D-1571)
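
The core counts in the ranking follow from each system's node configuration. The sketch below checks them, under two assumptions that the page does not state directly: every PEZY-SC2 in all four systems has 1,984 enabled cores (the text gives this only for Gyoukou), and each Xeon E5-2618L v3 contributes 8 cores. The dictionary layout and function names are illustrative.

```python
PEZY_ENABLED_CORES = 1_984  # assumed for all four systems

# name: (nodes, PEZY-SC2 per node, host CPUs per node,
#        cores per host CPU, core count listed in the ranking)
SYSTEMS = {
    "Shoubu system B": (50, 8, 1, 16, 794_400),
    "Suiren2": (48, 8, 1, 16, 762_624),
    "Sakura": (50, 8, 2, 8, 794_400),   # 8 cores per E5-2618L v3 (assumption)
    "Gyoukou": (1250, 8, 1, 16, 19_860_000),
}


def total_cores(nodes, pezy_per_node, hosts_per_node, cores_per_host):
    """Accelerator plus host cores across all nodes of a system."""
    per_node = pezy_per_node * PEZY_ENABLED_CORES + hosts_per_node * cores_per_host
    return nodes * per_node


def check_all():
    """Map each system name to whether the computed total matches the listed one."""
    return {
        name: total_cores(*cfg[:4]) == cfg[4]
        for name, cfg in SYSTEMS.items()
    }


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(f"{name}: {'matches' if ok else 'differs'}")
```

Under these assumptions every node contributes 15,888 cores, and all four listed totals are reproduced.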
