(fixed typo) |
|||
Line 4: | Line 4: | ||
| no image = | | no image = | ||
| image = pezy sc.jpg | | image = pezy sc.jpg | ||
− | | image size = | + | | image size = 300px |
| caption = | | caption = | ||
| designer = PEZY | | designer = PEZY | ||
Line 21: | Line 21: | ||
| frequency = 733.33 MHz | | frequency = 733.33 MHz | ||
| bus type = | | bus type = | ||
− | | bus speed = | + | | bus speed = |
| bus rate = | | bus rate = | ||
− | | clock multiplier = | + | | clock multiplier = |
| microarch = | | microarch = | ||
Line 93: | Line 93: | ||
The PE are the individual execution cores. | The PE are the individual execution cores. | ||
{{expand section}} | {{expand section}} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
== Cache == | == Cache == | ||
PEZY-SC's cache is separate from the {{armh|ARM926}}'s cache which has an L1$ of 32 KiB (2x) and 64 KiB L2$ (shared). | PEZY-SC's cache is separate from the {{armh|ARM926}}'s cache which has an L1$ of 32 KiB (2x) and 64 KiB L2$ (shared). | ||
− | {{cache | + | {{cache size |
+ | |l1 cache=3 MiB | ||
|l1i cache=2 MiB | |l1i cache=2 MiB | ||
|l1i break=1024x2 KiB | |l1i break=1024x2 KiB | ||
− | |l1i | + | |l1i desc=per processor element |
|l1d cache=1 MiB | |l1d cache=1 MiB | ||
|l1d break=512x2 KiB | |l1d break=512x2 KiB | ||
− | |l1d | + | |l1d desc=per 2 processor elements |
+ | |l1d policy= | ||
|l2 cache=4 MiB | |l2 cache=4 MiB | ||
|l2 break=4x2 MiB | |l2 break=4x2 MiB | ||
− | |l2 | + | |l2 desc=per city |
+ | |l2 policy=write-back | ||
|l3 cache=8 MiB | |l3 cache=8 MiB | ||
|l3 break=4x2 MiB | |l3 break=4x2 MiB | ||
− | |l3 | + | |l3 desc=per prefecture |
+ | |l3 policy= | ||
}} | }} | ||
== Memory controller == | == Memory controller == | ||
− | {{ | + | {{memory controller |
− | | type | + | |type=DDR4-1333 |
− | | controllers | + | |ecc=Yes |
− | | channels | + | |max mem= |
− | | | + | |controllers=1 |
− | | bandwidth schan | + | |channels=8 |
− | | bandwidth dchan | + | |max bandwidth=84,800 GiB/s |
− | | bandwidth qchan | + | |bandwidth schan=9.934 GiB/s |
− | | bandwidth ochan | + | |bandwidth dchan=19.868 GiB/s |
− | + | |bandwidth qchan=39.736 GiB/s | |
+ | |bandwidth ochan=79.472 GiB/s | ||
}} | }} | ||
== Expansions == | == Expansions == | ||
− | {{ | + | {{expansions |
| pcie revision = 2.0 | | pcie revision = 2.0 | ||
| pcie lanes = 8 | | pcie lanes = 8 | ||
− | | pcie config = | + | | pcie config = 1x8 |
− | | pcie config | + | | pcie config 2 = 2x4 |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| uart = Yes | | uart = Yes | ||
| gp io = Yes | | gp io = Yes | ||
}} | }} | ||
+ | |||
+ | == Die Shot == | ||
+ | * [[28 nm process]] | ||
+ | * 19.5 mm × 21.1 mm | ||
+ | * 411.6 mm² die size | ||
+ | [[File:pezy sc die shot.jpg|650px]] | ||
+ | |||
+ | |||
+ | [[File:pezy-sc die shot (annotated).png|650px]] | ||
== External Links == | == External Links == | ||
Line 151: | Line 152: | ||
* [https://www.top500.org/system/178542 Shoubu Supercomputer on TOP500] | * [https://www.top500.org/system/178542 Shoubu Supercomputer on TOP500] | ||
* [https://www.top500.org/system/178540 Suiren Blue on TOP500] | * [https://www.top500.org/system/178540 Suiren Blue on TOP500] | ||
− | |||
− |
Revision as of 23:05, 28 April 2017
PEZY-SC (PEZY Super Computer) is a second-generation many-core microprocessor developed by PEZY in 2014. PEZY-SC contains 2 ARM926 cores (ARMv5TEJ) along with 1024 simpler RISC cores. Operating at 733 MHz, the processor is said to have a peak performance of 3.0 TFLOPS (single-precision) and 1.5 TFLOPS (double-precision). PEZY-SC was designed using 580 million gates and is manufactured on TSMC's 28HPC+ (28 nm process). The PEZY-SC is used in a number of TOP500 and Green500 supercomputers, including systems ranked as the world's most energy-efficient.
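The quoted peak figures can be sanity-checked from the core count and clock. The sketch below is only an illustration: the per-cycle throughput values are assumptions chosen for the example (one fused multiply-add per core per cycle in double precision, twice that in single precision) and are not stated on this page.

```python
# Sketch: reproduce the quoted peak-performance figures from core count and clock.
CORES = 1024           # simpler RISC cores, from the article
CLOCK_HZ = 733.33e6    # 733.33 MHz, from the infobox

# ASSUMPTION (not stated in the article): each core completes one fused
# multiply-add (2 FLOPs) per cycle in double precision, and twice that
# rate in single precision.
DP_FLOPS_PER_CYCLE = 2
SP_FLOPS_PER_CYCLE = 4


def peak_tflops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak = cores x clock x FLOPs per core per cycle."""
    return cores * clock_hz * flops_per_cycle / 1e12


dp_peak = peak_tflops(CORES, CLOCK_HZ, DP_FLOPS_PER_CYCLE)
sp_peak = peak_tflops(CORES, CLOCK_HZ, SP_FLOPS_PER_CYCLE)

print(f"Double-precision peak: {dp_peak:.2f} TFLOPS")
print(f"Single-precision peak: {sp_peak:.2f} TFLOPS")
```

Under these assumptions the results round to the 1.5 and 3.0 TFLOPS figures quoted above.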
Overview
- See also: PEZY-1
The PEZY-SC (SC for "Super Computer") is PEZY's second-generation microprocessor, which builds upon the PEZY-1. The chip contains exactly twice as many cores as the PEZY-1 and incorporates a large amount of cache, including 8 MB of L3$.
In June 2015, PEZY-SC-based supercomputers took the top 3 spots on the Green500 list as the 3 most energy-efficient supercomputers. PEZY-SC powers the Shoubu (1,181,952 cores, ? kW, 605.624 TFlop/s Linpack Rmax), Suiren Blue (262,656 cores, 40.86 kW, 247.752 TFlop/s Linpack Rmax), and Suiren (328,480 cores, 48.90 kW, 271.782 TFlop/s Linpack Rmax) supercomputers (ranked 1, 2, and 3 respectively).
Architecture
The PEZY-SC microprocessor is made of 4 blocks called "Prefectures". Each Prefecture contains 2 MB of L3$ enclosed by 16 smaller blocks called "Cities". Each City is made of 64 KB of L2$, a number of special function units, and 4 smaller blocks called "Villages". A Village is a block of 4 execution units. For every 2 execution units there are 2 KB of L1d$.
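The hierarchy described above can be multiplied out to check that it agrees with the core count and cache totals given elsewhere on this page. This is a minimal sketch; the constant names are illustrative, and KB/MB are read as KiB/MiB, following the note in the Cache section.

```python
# Sketch: multiply out the Prefecture > City > Village > PE hierarchy.
PREFECTURES = 4
CITIES_PER_PREFECTURE = 16
VILLAGES_PER_CITY = 4
PES_PER_VILLAGE = 4

# Cache sizes in KiB (ASSUMPTION: the article's KB/MB mean KiB/MiB).
L3_PER_PREFECTURE_KIB = 2 * 1024
L2_PER_CITY_KIB = 64
L1D_PER_PE_PAIR_KIB = 2

cities = PREFECTURES * CITIES_PER_PREFECTURE
pes = cities * VILLAGES_PER_CITY * PES_PER_VILLAGE

l3_total_kib = PREFECTURES * L3_PER_PREFECTURE_KIB
l2_total_kib = cities * L2_PER_CITY_KIB
l1d_total_kib = (pes // 2) * L1D_PER_PE_PAIR_KIB  # one L1d$ per 2 PEs

print(f"Cities: {cities}, processor elements: {pes}")
print(f"L3$ total:  {l3_total_kib // 1024} MiB")
print(f"L2$ total:  {l2_total_kib // 1024} MiB")
print(f"L1d$ total: {l1d_total_kib // 1024} MiB")
```

The totals come out to 1024 processor elements, 8 MiB of L3$, 4 MiB of L2$, and 1 MiB of L1d$, matching the cache sizes listed for the chip.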
Processor Element (PE)
The PEs are the individual execution cores.
Cache
PEZY-SC's cache is separate from the ARM926's cache, which has an L1$ of 32 KiB (2x) and a 64 KiB L2$ (shared).
Cache Organization
Cache is a hardware component containing a relatively small and extremely fast memory designed to speed up the performance of a CPU by preparing ahead of time the data it needs to read from a relatively slower medium such as main memory. The organization and amount of cache can have a large impact on the performance, power consumption, die size, and consequently cost of the IC. Cache is specified by its size, number of sets, associativity, block size, sub-block size, and fetch and write-back policies.
Note: All units are in kibibytes and mebibytes.
Memory controller
Integrated Memory Controller
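The per-channel bandwidth figures recorded for this chip (9.934 GiB/s single-channel up to 79.472 GiB/s octa-channel) can be approximated from the memory type. The sketch below is illustrative only: it assumes a 64-bit data bus per channel and reads "DDR4-1333" as 1333⅓ MT/s, neither of which is stated on this page.

```python
# Sketch: theoretical DRAM bandwidth per channel count.
TRANSFERS_PER_SEC = 4000e6 / 3   # ASSUMPTION: DDR4-1333 read as 1333.33 MT/s
BUS_BYTES_PER_TRANSFER = 8       # ASSUMPTION: 64-bit data bus per channel
GIB = 2 ** 30


def bandwidth_gib_s(channels):
    """Peak bandwidth in GiB/s for the given number of channels."""
    return channels * TRANSFERS_PER_SEC * BUS_BYTES_PER_TRANSFER / GIB


for channels in (1, 2, 4, 8):
    print(f"{channels} channel(s): {bandwidth_gib_s(channels):.3f} GiB/s")
```

Under these assumptions, eight channels come to roughly 79.47 GiB/s. That agrees with the octa-channel figure, but not with the 84,800 GiB/s recorded as the maximum memory bandwidth, so the latter value may be worth re-checking.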
Expansions
Expansion Options
Die Shot
- 28 nm process
- 19.5 mm × 21.1 mm
- 411.6 mm² die size
External Links
Has subobject | PEZY-SC - PEZY#io +
has ecc memory support | true + |
l1$ size | 3,072 KiB (3,145,728 B, 3 MiB) + |
l1d$ description | per 2 processor elements + |
l1d$ size | 1,024 KiB (1,048,576 B, 1 MiB) + |
l1i$ description | per processor element + |
l1i$ size | 2,048 KiB (2,097,152 B, 2 MiB) + |
l2$ description | per city + |
l2$ size | 4 MiB (4,096 KiB, 4,194,304 B, 0.00391 GiB) + |
l3$ description | per prefecture + |
l3$ size | 8 MiB (8,192 KiB, 8,388,608 B, 0.00781 GiB) + |
max memory bandwidth | 84,800 GiB/s (86,835,200 MiB/s, 91,053.307 GB/s, 91,053,306.675 MB/s, 82.813 TiB/s, 91.053 TB/s) + |
max memory channels | 8 + |
max pcie lanes | 8 + |
supported memory type | DDR4-1333 + |