POWER9 µarch |
|
Arch Type | CPU |
Designer | IBM |
Manufacturer | GlobalFoundries |
Introduction | August, 2017 |
Phase-out | August, 2018 |
Process | 14 nm |
Core Configs | 24 |
|
Type | Superscalar |
Speculative | Yes |
Reg Renaming | Yes |
Stages | 12-16 |
|
ISA | Power ISA v3.0 |
|
L1I Cache | 32 KiB/core |
L1D Cache | 32 KiB/core |
L2 Cache | 512 KiB/core |
L3 Cache | 120 MiB/chip |
|
|
POWER9 is IBM's successor to POWER8, a 14 nm microarchitecture for Power-based server microprocessors that was introduced in the second half of 2017. POWER9-based processors are branded under the POWER9 family.
Process Technology
POWER9 is fabricated on GlobalFoundries' 14 nm FinFET process. AMD's Zen microarchitecture is also manufactured on a GlobalFoundries 14 nm node.
Compatibility
Initial support for POWER9 started with Linux Kernel 4.8.
Vendor | OS | Version | Notes
IBM | AIX | 7.? | Support
IBM | IBM i | ? | Support
Linux | Linux | Kernel 4.8 | Initial Support
Wind River | VxWorks | VxWorks 7.? | Support
Compiler support
Compiler | CPU | Arch-Favorable
GCC | -mcpu=power9 | -mtune=power9
LLVM | -mcpu=pwr9 | -mtune=pwr9
XL C/C++ | -mcpu=pwr9 | -mtune=pwr9
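As an illustration of how the flags in the LLVM row might be applied, the following minimal Python sketch assembles a `clang` command line. The helper name `clang_command` and the file names `hello.c` and `hello` are hypothetical; only the two `-mcpu`/`-mtune` values come from the table.

```python
import shlex

# Flags copied from the LLVM row of the compiler table.
POWER9_FLAGS = ["-mcpu=pwr9", "-mtune=pwr9"]


def clang_command(source, output, extra=()):
    """Return an argv list that compiles `source` for POWER9 with clang.

    `clang_command` is a hypothetical helper for illustration only.
    """
    return ["clang", *POWER9_FLAGS, *extra, "-o", output, source]


# Hypothetical file names; -O2 is just an example of an extra option.
cmd = clang_command("hello.c", "hello", extra=["-O2"])
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` on a machine with a POWER9-capable toolchain installed.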
Variations
IBM offers POWER9 in two flavors: Scale-Out (SO) and Scale-Up (SU). The Scale-Out variations are designed for traditional datacenter clusters using single- and dual-socket setups. The Scale-Up variations are designed for NUMA servers with four or more sockets, supporting large memory capacity and throughput.
Both the Scale-Out and the Scale-Up flavors come in two variations: a 12-core SMT8 model and a 24-core SMT4 model. The SMT4 model is optimized for the Linux ecosystem, whereas the SMT8 model is said to be optimized for the PowerVM ecosystem (AIX and IBM i customers). Those models support up to 8 channels of DDR4 memory for up to 128 GiB of memory.
Flavor | Linux Ecosystem | PowerVM Ecosystem
Scale-Out (SO) | 24-core / 96 Threads | 12-core / 96 Threads
Scale-Up (SU) | 24-core / 96 Threads | 12-core / 96 Threads
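Both core configurations reach the same thread count because threads per chip equal cores times SMT width. A minimal Python sketch of that arithmetic follows; the `Variant` class and its field names are illustrative and not IBM terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One POWER9 core configuration (hypothetical type, for illustration)."""
    ecosystem: str
    cores: int
    smt: int  # hardware threads per core

    @property
    def threads(self) -> int:
        # Total hardware threads per chip = cores x SMT width.
        return self.cores * self.smt


VARIANTS = [
    Variant("Linux", cores=24, smt=4),
    Variant("PowerVM", cores=12, smt=8),
]

for v in VARIANTS:
    print(f"{v.ecosystem}: {v.cores}-core SMT{v.smt} = {v.threads} threads")
```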
Architecture
Key changes from POWER8
- 14 nm process (from 22 nm)
  - 17-layer metal stack
  - 8,000,000,000 transistors
- Support for Power ISA v3.0
- Higher single-thread performance
- New highly modular architecture
- Shorter pipeline
  - 5 stages eliminated from fetch to compute vs POWER8
- Cache
  - 120 MiB NUCA L3
    - eDRAM
  - 7 TB/s on-chip bandwidth
- Hardware Acceleration
- I/O Subsystem
  - PCIe Gen4
  - Local SMP - 16 GT/s per lane interface
  - Remote SMP - 25 GT/s per lane interface
    - 48-96 lanes capability
    - IBM's SMP connect for their scale-up systems
    - Also available for the accelerators
- Virtualization
  - QoS assistance
  - New Interrupt architecture
  - Workload-optimized frequency
  - Hardware enforced trusted execution
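To put the per-lane SMP rates in the list above into perspective, the following rough Python sketch converts a transfer rate and lane count into raw one-direction bandwidth. It assumes one bit per transfer per lane and ignores encoding and protocol overhead; both are simplifying assumptions rather than figures from the source. Applying the 48 and 96 lane counts to both link types is likewise only illustrative, and `raw_gbytes_per_s` is a hypothetical helper.

```python
def raw_gbytes_per_s(gt_per_s, lanes):
    """Raw one-direction bandwidth in GB/s.

    Assumes 1 bit per transfer per lane and no encoding or protocol
    overhead (simplifying assumptions, not figures from the source).
    """
    return gt_per_s * lanes / 8  # 8 bits per byte


# Per-lane rates come from the list; the lane counts are the two ends
# of the quoted 48-96 range, applied to both links for illustration.
for name, rate in (("Local SMP", 16), ("Remote SMP", 25)):
    for lanes in (48, 96):
        bw = raw_gbytes_per_s(rate, lanes)
        print(f"{name}: {rate} GT/s x {lanes} lanes = {bw:.0f} GB/s raw")
```

Real links deliver less than these raw figures once line coding and protocol framing are taken into account.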
Pipeline
Die Shot
See also