From WikiChip
POWER9 - Microarchitectures - IBM
Revision as of 21:51, 4 February 2017 by At32Hz (talk | contribs) (Pipeline)

POWER9 µarch
General Info
Arch Type: CPU
Designer: IBM
Manufacturer: GlobalFoundries
Introduction: August 2017
Phase-out: August 2018
Process: 14 nm
Core Configs: 24
Pipeline
Type: Superscalar
Speculative: Yes
Reg Renaming: Yes
Stages: 12-16
Instructions
ISA: Power ISA v3.0
Cache
L1I Cache: 32 KiB/core
L1D Cache: 32 KiB/core
L2 Cache: 512 KiB/core
L3 Cache: 120 MiB/chip

POWER9 is IBM's successor to POWER8: a 14 nm microarchitecture for Power-based server microprocessors, set to be introduced in the second half of 2017. POWER9-based processors are branded under the POWER9 family.

Process Technology

POWER9 is set to be fabricated on GlobalFoundries' 14 nm FinFET process (14HP). AMD's Zen microarchitecture is also manufactured by GlobalFoundries at 14 nm, though on the separate 14LPP process.

Compatibility

Initial support for POWER9 started with Linux Kernel 4.8.

Vendor       OS        Version        Notes
IBM          AIX       7.?            Support
IBM          IBM i     ?              Support
Linux        Linux     Kernel 4.8     Initial Support
Wind River   VxWorks   VxWorks 7.?    Support

Compiler support

Compiler    CPU            Arch-Favorable
GCC         -mcpu=power9   -mtune=power9
LLVM        -mcpu=pwr9     -mtune=pwr9
XL C/C++    -mcpu=pwr9     -mtune=pwr9

Variations

IBM offers POWER9 in two flavors: Scale-Out (SO) and Scale-Up (SU). The Scale-Out variations are designed for traditional datacenter clusters using single- and dual-socket setups. The Scale-Up variations are designed for NUMA servers with four or more sockets, supporting large memory capacity and high throughput.

Both the Scale-Out and the Scale-Up flavors come in two variations: a 12-core SMT8 model and a 24-core SMT4 model. The SMT4 model is optimized for the Linux ecosystem, whereas the SMT8 model is said to be optimized for the PowerVM ecosystem (AIX / IBM i customers). These models support up to 8 channels of DDR4 memory for up to 128 GiB of memory.

                 Linux Ecosystem        PowerVM Ecosystem
                 24-core / 96 threads   12-core / 96 threads
Scale-Out (SO)   p9sosmt4.png           p9sosmt8.png
Scale-Up (SU)    p9susmt4.png           p9susmt8.png
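
The thread counts in the table follow directly from cores multiplied by SMT width. A minimal sketch of that arithmetic is given here; the `CoreConfig` class and its field names are hypothetical and serve only as illustration, not as any IBM tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    """Illustrative model of one POWER9 chip variation (names are hypothetical)."""
    name: str
    cores: int
    threads_per_core: int  # SMT width: 4 for SMT4, 8 for SMT8

    @property
    def hardware_threads(self) -> int:
        # Total hardware threads per chip = cores x SMT width.
        return self.cores * self.threads_per_core


# Core counts and SMT widths as stated in the text above.
SMT4 = CoreConfig("24-core SMT4 (Linux ecosystem)", cores=24, threads_per_core=4)
SMT8 = CoreConfig("12-core SMT8 (PowerVM ecosystem)", cores=12, threads_per_core=8)

for cfg in (SMT4, SMT8):
    print(f"{cfg.name}: {cfg.hardware_threads} threads")
```

Both variations expose the same number of hardware threads per chip; they differ in how those threads are grouped into cores.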

Architecture

Key changes from POWER8

  • 14 nm process (from 22 nm)
    • 17-layer metal stack
    • 8,000,000,000 transistors
  • Support for Power ISA v3.0
  • Higher single-thread performance
  • New highly modular architecture
  • Shorter pipeline
    • 5 stages eliminated from fetch to compute vs POWER8
  • Cache
    • 120 MiB NUCA L3
      • eDRAM
      • 7 TB/s on-chip bandwidth
  • Hardware Acceleration
  • I/O Subsystem
    • PCIe Gen4
    • Local SMP - 16 GT/s per lane interface
    • Remote SMP - 25 GT/s per lane interface
      • 48-96 lanes capability
      • IBM's SMP connect for their scale-up systems
      • Also available for the accelerators
  • Virtualization
    • QoS assistance
    • New Interrupt architecture
    • Workload-optimized frequency
    • Hardware enforced trusted execution
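
Two figures from the list above lend themselves to simple arithmetic: the L3 capacity relative to the core count, and the aggregate raw signaling rate of the 25 GT/s interface. The sketch below rests on loud assumptions that are not from the source: an even split of L3 across 24 SMT4 cores (NUCA means access latency varies by location, so this is only an average), and one bit per transfer per lane with no encoding or protocol overhead. The function names are hypothetical.

```python
# Figures taken from the list above.
L3_TOTAL_MIB = 120
MAX_SMT4_CORES = 24
REMOTE_SMP_GT_PER_LANE = 25   # GT/s per lane
LANE_COUNTS = (48, 96)        # "48-96 lanes capability"


def l3_per_core_mib(total_mib: float, cores: int) -> float:
    """Average L3 per core, ASSUMING an even split (illustrative only)."""
    return total_mib / cores


def aggregate_raw_gbit_per_s(lanes: int, gt_per_lane: float,
                             bits_per_transfer: int = 1) -> float:
    """Raw signaling rate across all lanes.

    ASSUMPTION: one bit per transfer per lane and no encoding or
    protocol overhead, so this is an upper bound, not usable bandwidth.
    """
    return lanes * gt_per_lane * bits_per_transfer


print(f"L3 per SMT4 core (even split): {l3_per_core_mib(L3_TOTAL_MIB, MAX_SMT4_CORES)} MiB")
for lanes in LANE_COUNTS:
    rate = aggregate_raw_gbit_per_s(lanes, REMOTE_SMP_GT_PER_LANE)
    print(f"{lanes} lanes at {REMOTE_SMP_GT_PER_LANE} GT/s: {rate} Gbit/s raw")
```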

Execution Slice Microarchitecture

The Execution Slice Microarchitecture is POWER9's new, modular core design. The same modules are used to build both the SMT4 and SMT8 cores (in theory they could scale to higher thread counts, although that will not happen in this iteration). This modularity lets IBM cover the various processor models and their different configurations, such as bandwidth and line size (from 128-byte down to 64-byte sectors).

A Slice is the basic 64-bit computing block, incorporating a single Vector and Scalar Unit (VSU) coupled with a Load/Store Unit (LSU). The VSU has a heterogeneous mix of computing capabilities, including integer and floating point, and supports both scalar and vector operations. IBM claims this setup allows for higher utilization of resources while providing efficient exchange of data between the individual slices. Two slices coupled together make up a Super-Slice, a 128-bit POWER9 physical design building block. Two super-slices, together with an Instruction Fetch Unit (IFU) and an Instruction Sequencing Unit (ISU), form a single POWER9 SMT4 core. The SMT8 variant is effectively two SMT4 units.

POWER8: p8smt8comp.png
P9 SMT8 (4x Super-Slice): p94xsuper-slice.png
P9 SMT4 (2x Super-Slice): p92xsuper-slice.png
Super-Slice: p9super-slice.png
Slice: p9slice.png
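
The composition described above (slice, super-slice, SMT4 core, SMT8 core) can be sketched as a small calculation. This is only an illustrative model: the constant and function names are hypothetical, and the "combined slice width" is simply the sum of the 64-bit slices, not a claim about any actual datapath in the hardware.

```python
# Building-block sizes as described in the text.
SLICE_BITS = 64                  # a Slice is the basic 64-bit computing block
SLICES_PER_SUPER_SLICE = 2       # two slices form a 128-bit Super-Slice
SUPER_SLICES_PER_SMT4_UNIT = 2   # two super-slices (plus IFU and ISU) form an SMT4 core
THREADS_PER_SMT4_UNIT = 4


def core_layout(smt4_units: int) -> dict:
    """Derive the block counts for a core built from `smt4_units` SMT4 units.

    smt4_units=1 models the SMT4 core; smt4_units=2 models the SMT8 core,
    which the text describes as effectively two SMT4 units.
    """
    super_slices = smt4_units * SUPER_SLICES_PER_SMT4_UNIT
    slices = super_slices * SLICES_PER_SUPER_SLICE
    return {
        "super_slices": super_slices,
        "slices": slices,
        "combined_slice_bits": slices * SLICE_BITS,  # sum of slice widths (illustrative)
        "threads": smt4_units * THREADS_PER_SMT4_UNIT,
    }


smt4_core = core_layout(smt4_units=1)
smt8_core = core_layout(smt4_units=2)
print("SMT4 core:", smt4_core)
print("SMT8 core:", smt8_core)
```

In this model both core types end up with one slice per hardware thread, which is consistent with the same modules being reused for both.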

Pipeline

This section is empty.

Die Shot

Tetracosa-Core



codename: POWER9
core count: 4, 8, 12, 16, 20 and 24
designer: IBM
first launched: August 2017
full page name: ibm/microarchitectures/power9
instance of: microarchitecture
instruction set architecture: Power ISA v3.0B
manufacturer: GlobalFoundries
microarchitecture type: CPU
name: POWER9
phase-out: 2020
pipeline stages (max): 16
pipeline stages (min): 12
process: 14 nm (0.014 μm, 1.4e-5 mm)