From WikiChip
|v core=1.0 V
|package module 1={{packages/pezy/fcbga-2112}}
|electrical=Yes
|packaging=Yes
|package 0=fcBGA-2112
|package 0 type=fcBGA
|package 0 pins=2112
|package 0 pitch=1 mm
|package 0 width=47.5 mm
|package 0 length=47.5 mm
|package 0 height=4.05 mm
|socket 0=BGA-2112
|socket 0 type=BGA
}}

Revision as of 02:13, 3 November 2017

PEZY-SC (PEZY Super Computer) is a second-generation many-core microprocessor developed by PEZY and introduced in 2014. The chip operates at 733 MHz and incorporates 1,024 cores while dissipating 100 W. It powers the ZettaScaler-1.x series of supercomputers, and PEZY-SC-based systems have appeared on the TOP500 list and ranked among the world's most energy-efficient supercomputers on the Green500 list.

Overview

See also: PEZY-1

The PEZY-SC (SC for "Super Computer") is PEZY's second-generation microprocessor, which builds upon the PEZY-1. The chip contains exactly twice as many cores as its predecessor and incorporates a large amount of cache, including 8 MiB of L3$. It contains 2 ARM926 cores (ARMv5TEJ) along with 1,024 simpler cores supporting 8-way SMT, for a total of 8,192 threads. Operating at 733 MHz, the processor has a peak performance of 3.0 TFLOPS (single-precision) and 1.5 TFLOPS (double-precision). PEZY-SC was designed using 580 million gates and is manufactured on TSMC's 28HPC+ process.
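The thread count and peak-performance figures above can be reproduced with simple arithmetic. The sketch below is illustrative only: the exact clock of 733,333,333 Hz and the per-core throughput of 4 single-precision and 2 double-precision FLOPs per cycle are assumptions chosen because they reproduce the quoted 3.0 and 1.5 TFLOPS; the prose itself states only "733 MHz".

```python
# Figures stated in the text.
CORES = 1024
SMT_WAYS = 8

# Assumptions (not stated in the prose): exact clock and FLOPs per core per cycle.
CLOCK_HZ = 733_333_333
SP_FLOPS_PER_CYCLE = 4
DP_FLOPS_PER_CYCLE = 2

# Logical threads = physical cores x SMT ways.
threads = CORES * SMT_WAYS

# Peak FLOPS = clock x FLOPs per cycle per core x number of cores.
peak_sp = CLOCK_HZ * SP_FLOPS_PER_CYCLE * CORES
peak_dp = CLOCK_HZ * DP_FLOPS_PER_CYCLE * CORES

print(f"threads: {threads:,}")
print(f"peak single-precision: {peak_sp / 1e12:.3f} TFLOPS")
print(f"peak double-precision: {peak_dp / 1e12:.3f} TFLOPS")
```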

In June 2015, PEZY-SC-based supercomputers took the top 3 spots on the Green500 list as the 3 most energy-efficient supercomputers: Shoubu (1,181,952 cores, ? kW, 605.624 TFlop/s Linpack Rmax), Suiren Blue (262,656 cores, 40.86 kW, 247.752 TFlop/s Linpack Rmax), and Suiren (328,480 cores, 48.90 kW, 271.782 TFlop/s Linpack Rmax), ranked 1, 2, and 3 respectively.
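As a rough illustration of how an efficiency ratio follows from the figures quoted above, the sketch below divides Rmax by the listed power. These ratios are derived only from the numbers in this paragraph and need not match the officially published Green500 values, which depend on the list's own power-measurement rules. Shoubu's power is not given here, so its ratio is left undefined.

```python
# (Rmax in TFlop/s, power in kW) as quoted in the text; None = not given.
systems = {
    "Shoubu": (605.624, None),
    "Suiren Blue": (247.752, 40.86),
    "Suiren": (271.782, 48.90),
}

def gflops_per_watt(rmax_tflops, power_kw):
    """Return Rmax / power in GFLOPS/W, or None if power is unknown."""
    if power_kw is None:
        return None
    # TFlop/s -> GFlop/s and kW -> W both scale by 1000, so the factors cancel.
    return rmax_tflops / power_kw

efficiency = {name: gflops_per_watt(*vals) for name, vals in systems.items()}

for name, value in efficiency.items():
    shown = "unknown" if value is None else f"{value:.3f} GFLOPS/W"
    print(f"{name}: {shown}")
```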

Architecture

Further information: PEZY-SCx § Architecture

The PEZY-SC microprocessor is made up of 4 blocks called "Prefectures". Each Prefecture contains 2 MiB of L3$ surrounded by 16 smaller blocks called "Cities". Each City is made up of 64 KiB of L2$, a number of special function units, and 4 smaller blocks called "Villages". A Village is a block of 4 execution units. For every 2 execution units there is 2 KiB of L1D$.
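The hierarchy described above multiplies out to the 1,024 cores and the cache totals quoted elsewhere on this page. A minimal sketch of that arithmetic follows; the constant and variable names are illustrative and not taken from any PEZY documentation.

```python
KIB = 1024
MIB = 1024 * KIB

# Block counts at each level of the hierarchy, as described in the text.
PREFECTURES = 4
CITIES_PER_PREFECTURE = 16
VILLAGES_PER_CITY = 4
UNITS_PER_VILLAGE = 4

cities = PREFECTURES * CITIES_PER_PREFECTURE
villages = cities * VILLAGES_PER_CITY
execution_units = villages * UNITS_PER_VILLAGE

# Cache totals implied by the per-block sizes.
l3_total = PREFECTURES * 2 * MIB              # 2 MiB per Prefecture
l2_total = cities * 64 * KIB                  # 64 KiB per City
l1d_total = (execution_units // 2) * 2 * KIB  # 2 KiB per 2 execution units

print(f"cities: {cities}, villages: {villages}, execution units: {execution_units}")
print(f"L3$: {l3_total // MIB} MiB, L2$: {l2_total // MIB} MiB, L1D$: {l1d_total // MIB} MiB")
```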

pezy-sc arch.svg

Cache

PEZY-SC's cache is separate from the cache of the ARM926 cores, which have an L1$ of 32 KiB (2x) and a shared 64 KiB L2$.

Cache Organization
Cache is a hardware component containing a relatively small and extremely fast memory designed to speed up the performance of a CPU by preparing ahead of time the data it needs to read from a relatively slower medium such as main memory.

The organization and amount of cache can have a large impact on the performance, power consumption, die size, and consequently cost of the IC.

Cache is specified by its size, number of sets, associativity, block size, sub-block size, and fetch and write-back policies.

Note: All units are in kibibytes and mebibytes.
L1$: 3 MiB (3,072 KiB, 3,145,728 B)
L1I$: 2 MiB (2,048 KiB, 2,097,152 B), 1024x2 KiB, per processor element
L1D$: 1 MiB (1,024 KiB, 1,048,576 B), 512x2 KiB, per 2 processor elements

L2$: 4 MiB (4,096 KiB, 4,194,304 B, 0.00391 GiB), 64x64 KiB, per city, write-back

L3$: 8 MiB (8,192 KiB, 8,388,608 B, 0.00781 GiB), 4x2 MiB, per prefecture

Additionally, each processor element (PE) has 16 KiB of scratchpad memory, for another 16 MiB in total.
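The L1 and scratchpad totals, and the unit conversions shown in the cache table, can be cross-checked with a short script. This is a sketch under the per-element sizes stated on this page; the helper `describe` is a hypothetical name introduced only for illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PROCESSOR_ELEMENTS = 1024

def describe(size_bytes):
    """Format a size in the MiB / KiB / B / GiB forms used by the cache table."""
    return (f"{size_bytes / MIB:g} MiB = {size_bytes // KIB:,} KiB = "
            f"{size_bytes:,} B = {size_bytes / GIB:.5f} GiB")

l1i = PROCESSOR_ELEMENTS * 2 * KIB           # 2 KiB per processor element
l1d = (PROCESSOR_ELEMENTS // 2) * 2 * KIB    # 2 KiB per 2 processor elements
l1 = l1i + l1d
scratchpad = PROCESSOR_ELEMENTS * 16 * KIB   # 16 KiB per processor element

print("L1$:       ", describe(l1))
print("Scratchpad:", describe(scratchpad))
```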

Memory controller

Integrated Memory Controller
Max Type: DDR4-2133
Supports ECC: Yes
Controllers: 8
Channels: 8
Width: 64 bit
Max Bandwidth: 127.156 GiB/s (130,207.744 MiB/s, 136.533 GB/s, 136,532.715 MB/s, 0.124 TiB/s, 0.137 TB/s)
Bandwidth (by channel count)
Single: 15.89 GiB/s
Double: 31.79 GiB/s
Quad: 63.58 GiB/s
Hexa: 95.37 GiB/s
Octa: 127.156 GiB/s
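The bandwidth figures follow from the transfer rate, bus width, and channel count. The sketch below assumes that DDR4-2133 corresponds to 6400/3 (about 2133.33) mega-transfers per second per channel; that value is an assumption not stated on this page, chosen because it matches the listed bandwidths to within rounding in the last digit.

```python
from fractions import Fraction

# Assumption: DDR4-2133 transfers 6400/3 million times per second per channel.
TRANSFERS_PER_SEC = Fraction(6400, 3) * 1_000_000
BUS_WIDTH_BYTES = 64 // 8   # 64-bit channel width, from the table
MAX_CHANNELS = 8            # from the table
GIB = 2**30

def bandwidth_bytes_per_sec(channels):
    """Theoretical peak bandwidth in bytes/s for the given number of channels."""
    return TRANSFERS_PER_SEC * BUS_WIDTH_BYTES * channels

def bandwidth_gib_per_sec(channels):
    """Same figure expressed in GiB/s."""
    return float(bandwidth_bytes_per_sec(channels) / GIB)

for channels in (1, 2, 4, 6, MAX_CHANNELS):
    print(f"{channels} channel(s): {bandwidth_gib_per_sec(channels):.2f} GiB/s")

max_gb_per_sec = float(bandwidth_bytes_per_sec(MAX_CHANNELS) / 10**9)
print(f"max: {max_gb_per_sec:.3f} GB/s")
```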

Expansions

Expansion Options
PCIe
Revision: 2.0
Max Lanes: 32
Configuration: 4x8
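For orientation, the lane count and a theoretical link bandwidth can be worked out as follows. This is only a sketch: reading "4x8" as 4 ports of 8 lanes each is an assumption, and the per-lane figures (5 GT/s with 8b/10b encoding) are general PCIe 2.0 characteristics rather than numbers taken from this page.

```python
# Assumption: "4x8" means 4 ports of 8 lanes each.
PORTS = 4
LANES_PER_PORT = 8

# General PCIe 2.0 characteristics (not from this page):
RAW_MBIT_PER_LANE = 5000   # 5 GT/s per lane, per direction
ENCODING_NUM, ENCODING_DEN = 8, 10   # 8b/10b line encoding

total_lanes = PORTS * LANES_PER_PORT

# Usable payload rate per lane, per direction, in MB/s.
payload_mbit = RAW_MBIT_PER_LANE * ENCODING_NUM // ENCODING_DEN
per_lane_mb_s = payload_mbit // 8

per_port_mb_s = per_lane_mb_s * LANES_PER_PORT
total_mb_s = per_lane_mb_s * total_lanes

print(f"lanes: {total_lanes}")
print(f"per port: {per_port_mb_s} MB/s, all ports: {total_mb_s} MB/s (per direction)")
```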


Die Shot

pezy sc die shot.jpg


pezy-sc die shot (annotated).png

Floorplan

pezy-sc floorplan.png

External Links
