Latest revision as of 10:15, 22 September 2018
| PEZY-SCnp | |
| General Info | |
| Designer | PEZY |
| Manufacturer | TSMC |
| Model Number | PEZY-SCnp |
| Market | Supercomputer |
| Introduction | May 6, 2016 (announced); May 6, 2016 (launched) |
| General Specs | |
| Family | PEZY-SCx |
| Frequency | 766.66 MHz |
| Microarchitecture | |
| Process | 28 nm |
| Technology | CMOS |
| Cores | 1,024 |
| Threads | 8,192 |
| Electrical | |
| Power dissipation | 100 W |
| Power dissipation (average) | 70 W |
| Vcore | 0.95 V |
| Packaging | |
| Package | FCBGA-2397 |
PEZY-SCnp (SC - Super Computer; np - New Package) is a revised version of the PEZY-SC model by PEZY, introduced in May 2016. The new chip uses a slightly different, slightly larger package to address a number of signal-related issues (DRAM/PCIe signal failures). It also runs at a lower core voltage and a slightly higher core frequency, and thus delivers higher performance. Operating at 766.66 MHz, the processor has a peak performance of 3.14 TFLOPS (single-precision) and 1.57 TFLOPS (double-precision). PEZY also upgraded the connections from PCIe Gen2 to Gen3. Like the PEZY-SC, the SCnp is manufactured on TSMC's 28HPC+.
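The peak figures quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the clock and core count come from this page, while the per-cycle FLOP counts (4 single-precision and 2 double-precision per core) are assumptions inferred from the listed totals rather than documented values.

```python
# Clock and core count as listed on this page.
CLOCK_HZ = 766_666_666   # 766.66 MHz
CORES = 1024

# Assumed per-core throughput, inferred from the listed peak figures.
SP_FLOPS_PER_CYCLE = 4
DP_FLOPS_PER_CYCLE = 2


def peak_flops(clock_hz: int, cores: int, flops_per_cycle: int) -> int:
    """Theoretical peak: every core retires flops_per_cycle each clock."""
    return clock_hz * cores * flops_per_cycle


sp_peak = peak_flops(CLOCK_HZ, CORES, SP_FLOPS_PER_CYCLE)
dp_peak = peak_flops(CLOCK_HZ, CORES, DP_FLOPS_PER_CYCLE)

print(f"single-precision: {sp_peak / 1e12:.2f} TFLOPS")
print(f"double-precision: {dp_peak / 1e12:.2f} TFLOPS")
```

Under these assumptions the results match the 3,140,266,663,936 FLOPS and 1,570,133,331,968 FLOPS values recorded for the chip.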
Architecture
- Further information: PEZY-SC § Architecture and PEZY-SCx § Architecture
The PEZY-SCnp's architecture is identical to that of the PEZY-SC.
Cache
PEZY-SC's cache is separate from the ARM926's cache, which has an L1$ of 32 KiB (2x) and a shared 64 KiB L2$.
Cache Organization (ARM926)
Cache is a hardware component containing a relatively small and extremely fast memory designed to speed up the performance of a CPU by preparing ahead of time the data it needs to read from a relatively slower medium such as main memory. The organization and amount of cache can have a large impact on the performance, power consumption, die size, and consequently cost of the IC. Cache is specified by its size, number of sets, associativity, block size, sub-block size, and fetch and write-back policies. Note: All units are in kibibytes and mebibytes.
| Level | Size | Organization |
| L1$ | 64 KiB | |
| L1I$ | 32 KiB | 2x16 KiB |
| L1D$ | 32 KiB | 2x16 KiB |
| L2$ | 64 KiB | 1x64 KiB |
The chip integrates a multi-level cache hierarchy:
Cache Organization (PEZY-SCnp)
| Level | Size | Description |
| L1$ | 3 MiB | |
| L1I$ | 2 MiB | per processor element |
| L1D$ | 1 MiB | per 2 processor elements |
| L2$ | 4 MiB | per city |
| L3$ | 8 MiB | per prefecture |
Additionally, there is another 16 MiB of scratch-pad memory consisting of 16 KiB per PE.
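As a quick sanity check of the scratch-pad total, the arithmetic can be sketched as follows. This assumes that the 1,024 cores listed for the chip are the processor elements (PEs) referred to here; the thread ratio is simply derived from the listed thread and core counts.

```python
KIB = 1024
MIB = 1024 * KIB

PE_COUNT = 1024                  # assumed equal to the listed core count
THREAD_COUNT = 8192              # listed thread count
SCRATCHPAD_PER_PE = 16 * KIB     # 16 KiB of scratch-pad memory per PE

# Total scratch-pad capacity across all PEs, in whole MiB.
total_scratchpad_mib = PE_COUNT * SCRATCHPAD_PER_PE // MIB

# Hardware threads per PE implied by the listed counts.
threads_per_pe = THREAD_COUNT // PE_COUNT

print(f"scratch-pad total: {total_scratchpad_mib} MiB")
print(f"threads per PE: {threads_per_pe}")
```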
Memory controller
Integrated Memory Controller
| Max Type | DDR4-2133 |
| Supports ECC | Yes |
| Max Channels | 8 |
| Max Bandwidth | 127.156 GiB/s (136.533 GB/s) |
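The maximum memory bandwidth listed for the chip is consistent with eight DDR4-2133 channels. The sketch below shows the arithmetic; the 64-bit (8-byte) channel width and the exact transfer rate of 2133⅓ MT/s are assumptions that are not stated on this page.

```python
from fractions import Fraction

# DDR4-2133 transfer rate, taken as 2133 1/3 MT/s (assumption).
TRANSFERS_PER_SECOND = Fraction(6400, 3) * 1_000_000
CHANNELS = 8              # listed maximum memory channels
BYTES_PER_TRANSFER = 8    # assumed 64-bit data bus per channel

bandwidth_bytes = TRANSFERS_PER_SECOND * CHANNELS * BYTES_PER_TRANSFER

bandwidth_gb = float(bandwidth_bytes / 10**9)    # decimal gigabytes per second
bandwidth_gib = float(bandwidth_bytes / 2**30)   # binary gibibytes per second

print(f"{bandwidth_gb:.3f} GB/s, {bandwidth_gib:.3f} GiB/s")
```

Under these assumptions the result agrees with the listed 136.533 GB/s and 127.156 GiB/s to within rounding in the last digit.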
Expansions
Expansion Options
| PCIe | Gen3 |
| Has subobject | PEZY-SCnp - PEZY#package and PEZY-SCnp - PEZY#pcie |
| base frequency | 766.66 MHz (0.767 GHz, 766,660 kHz) |
| core count | 1,024 |
| core voltage | 0.95 V (9.5 dV, 95 cV, 950 mV) |
| designer | PEZY |
| family | PEZY-SCx |
| first announced | May 6, 2016 |
| first launched | May 6, 2016 |
| full page name | pezy/pezy-scx/pezy-scnp |
| has ecc memory support | true |
| instance of | microprocessor |
| l1$ size | 64 KiB (65,536 B, 0.0625 MiB) and 3,072 KiB (3,145,728 B, 3 MiB) |
| l1d$ description | per 2 processor elements |
| l1d$ size | 32 KiB (32,768 B, 0.0313 MiB) and 1,024 KiB (1,048,576 B, 1 MiB) |
| l1i$ description | per processor element |
| l1i$ size | 32 KiB (32,768 B, 0.0313 MiB) and 2,048 KiB (2,097,152 B, 2 MiB) |
| l2$ description | per city |
| l2$ size | 0.0625 MiB (64 KiB, 65,536 B, 6.103516e-5 GiB) and 4 MiB (4,096 KiB, 4,194,304 B, 0.00391 GiB) |
| l3$ description | per prefecture |
| l3$ size | 8 MiB (8,192 KiB, 8,388,608 B, 0.00781 GiB) |
| ldate | May 6, 2016 |
| manufacturer | TSMC |
| market segment | Supercomputer |
| max memory bandwidth | 127.156 GiB/s (130,207.744 MiB/s, 136.533 GB/s, 136,532.715 MB/s, 0.124 TiB/s, 0.137 TB/s) |
| max memory channels | 8 |
| model number | PEZY-SCnp |
| name | PEZY-SCnp |
| package | FCBGA-2397 |
| peak flops (double-precision) | 1,570,133,331,968 FLOPS (1,570,133,331.968 KFLOPS, 1,570,133.332 MFLOPS, 1,570.133 GFLOPS, 1.57 TFLOPS, 0.00157 PFLOPS, 1.570133e-6 EFLOPS, 1.570133e-9 ZFLOPS) |
| peak flops (single-precision) | 3,140,266,663,936 FLOPS (3,140,266,663.936 KFLOPS, 3,140,266.664 MFLOPS, 3,140.267 GFLOPS, 3.14 TFLOPS, 0.00314 PFLOPS, 3.140267e-6 EFLOPS, 3.140267e-9 ZFLOPS) |
| power dissipation | 100 W (100,000 mW, 0.134 hp, 0.1 kW) |
| power dissipation (average) | 70 W (70,000 mW, 0.0939 hp, 0.07 kW) |
| process | 28 nm (0.028 μm, 2.8e-5 mm) |
| supported memory type | DDR4-2133 |
| technology | CMOS |
| thread count | 8,192 |
