From WikiChip
Zen 4 - Microarchitectures - AMD
Revision as of 18:28, 13 November 2023 by 37.201.198.27 (talk) (6.75k is not 6750, increase from 4096=8*8*2^6 to 6912=9*12*2^6 (=4096*9/8 +50%))

Zen 4 µarch
General Info
Arch Type: CPU
Designer: AMD
Manufacturer: TSMC
Process: 5 nm, 6 nm
Succession

Zen 4 is a microarchitecture developed by AMD as a successor to Zen 3. See press release for details: AMD Launches Ryzen 7000 Series Desktop Processors

History

Zen 4 on the roadmap.

Zen 4 was first mentioned by Forrest Norrod during AMD's EPYC One Year Anniversary webinar. At the Next Horizon event held on November 6, 2018, AMD stated that Zen 4 was at the design completion phase.

Products

Preliminary Data! Information presented in this article deals with future products, data, features, and specifications that have yet to be finalized, announced, or released. Information may be incomplete and can change by final release.
Processor Series | Cores/Threads | Market
EPYC 9004 "Genoa" | Up to 96/192 | High-end server multiprocessors
Ryzen Threadripper 7000 "Storm Peak" | Up to 96/192 | Workstation & enthusiasts
Ryzen 7000 "Raphael" | Up to 16/32 | Mainstream to high-end desktops & enthusiasts
Ryzen 7000 APU "Dragon Range" | Up to 16/32 | High-end mobile processors with GPU
Ryzen 7000 APU "Phoenix Point" | Up to 8/16 | Mainstream desktop & mobile processors with GPU

Processors using a variant of the Zen 4 microarchitecture:

Processor Series | Cores/Threads | Market
EPYC 9004 "Bergamo" | Up to 128/256 | Cloud multiprocessors (smaller, almost half-size Zen 4c [referred to as "Zen 4D" in leaks] core sacrificing half of the L3 cache)
EPYC 8004 "Siena" | Up to 64/128 | Edge-optimized server chips

Architectural Codenames:

Arch | Codename
Core | Persephone
CCD | Durango

Process Technology

Processors implementing Zen 4 are SoCs configured as a Multi-Chip Module or monolithic chip. MCMs consist of a single I/O die and up to 12 Core Complex Dies attached with full-duplex serial point-to-point links. The IOD contains memory controllers, I/O controllers, microcontrollers for security purposes and power management, and other peripherals. The CCDs communicate with peripherals and each other through the Data and Control Fabrics on the I/O die, and each contain a single Core Complex (CCX). The monolithic chips integrate a subset of the IOD facilities and additional peripherals tailored for their target market, a CCX, and a GPU. A CCX contains 8 CPU cores (fewer may be usable on some models) communicating through a shared L3 cache.

("Bergamo" processors configuration TBD.)

The chips are fabricated by TSMC: CCDs and monolithic chips on a 5 nm node, IODs on a 6 nm node.
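The MCM layout described above determines the headline core, thread and L3 figures quoted elsewhere in this article. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative only, and the per-CCX L3 size and SMT width are taken from the Architecture and Memory Hierarchy sections rather than from this one.

```python
# Illustrative sketch only: names are hypothetical, figures come from the article text.
CORES_PER_CCX = 8       # "A CCX contains 8 CPU cores"
CCX_PER_CCD = 1         # each CCD contains a single CCX
THREADS_PER_CORE = 2    # 2-way SMT
L3_PER_CCX_MIB = 32     # "Genoa": up to 32 MiB L3 per CCX
MAX_CCDS = 12           # MCMs carry up to 12 CCDs


def package_totals(ccd_count: int) -> dict:
    """Return the maximum cores, threads and L3 (MiB) for an MCM with ccd_count CCDs."""
    if not 1 <= ccd_count <= MAX_CCDS:
        raise ValueError(f"ccd_count must be between 1 and {MAX_CCDS}")
    ccx_count = ccd_count * CCX_PER_CCD
    cores = ccx_count * CORES_PER_CCX
    return {
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "l3_mib": ccx_count * L3_PER_CCX_MIB,
    }


print(package_totals(12))
```

With 12 CCDs this yields 96 cores, 192 threads and 384 MiB of L3, matching the "Genoa" maximums. Models with fewer usable cores per CCX fall below these upper bounds.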

Architecture

Zen 4 is a 64-bit superscalar, out-of-order, 2-way SMT microarchitecture with advanced dynamic branch prediction, 4-way decoding of x86 instructions with a stack optimizer, multiple caches including an Op cache for decoded instructions and prefetchers for code and data, four integer/address and two floating point instruction schedulers, 3-way address generation, 5-way integer execution, 4-way 256-bit wide floating point execution, a speculative, out-of-order load/store unit capable of up to three loads or two stores per cycle with a 48/88-entry load and 64-entry store queue, write-combining, and 5-level paging with four TLBs and six hardware page table walkers.

Key changes from Zen 3

  • AVX-512 instruction support over a 256-bit data path[1]
  • L1 DTLB size increased from 64 to 72 entries, L2 DTLB size from 2,048 to 3,072 entries
  • Op cache size increased from 4,096 to 6,912 Ops per core
  • L2 cache doubled from 512 KiB to 1 MiB per core (not all processor models), latency increased from 12 to 14 cycles minimum
  • L3 cache average load-to-use latency increased from 46 to 50 cycles
  • Five-level paging; max. physical address size raised from 48 to 52 bits, max. linear address size from 48 to 57 bits
  • Lower latency for cache loads, writes and prefetches from/to registers
  • Higher transistor density due to the 5 nm process
  • Capable of higher all-core clock speeds (shown by AMD to reach 5 GHz+ on all cores)
  • Larger integer register file (from 192 to 224 entries), floating-point register file (from 160 to 192 entries) and reorder buffer (from 256 to 320 entries)
  • REPE CMPSB (sometimes used to implement string comparison) is significantly faster, processing more than 32 bytes/cycle when operating on L1 data.
  • BSF, BSR, and the BMI1 instructions BLSI, BLSMSK, BLSR, TZCNT now have a lower latency of 1 cycle and twice the throughput (4 instructions/cycle).
  • Latency and/or throughput of VPERMx, V[P]BROADCASTx, VPMOV{S,Z}Xx instructions improved.
  • Throughput of some ALU operations on vector registers increased from 2 to 3 ops/cycle.
  • Latency of some ALU operations on vector registers (VPABSx, VPHADDx, VPHSUBx, VPSLLx, VPSRLx, VPSRAx, VPACKx, VPSIGNx, VMAXx, VMINx) increased by 1 cycle.
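The address-size change in the list above can be illustrated with a little arithmetic. The sketch below assumes the standard x86-64 paging layout (a 12-bit offset for 4-KiB pages and 9 index bits per paging level), which is general architecture knowledge rather than something this article states; the helper names are hypothetical.

```python
# Illustrative sketch; the 12-bit offset and 9 bits per level are the standard
# x86-64 long-mode layout and are an assumption not taken from the article.
PAGE_OFFSET_BITS = 12       # 4-KiB base page
INDEX_BITS_PER_LEVEL = 9    # each paging level indexes a 512-entry table


def linear_address_bits(levels: int) -> int:
    """Linear address width implied by a given number of paging levels."""
    return PAGE_OFFSET_BITS + levels * INDEX_BITS_PER_LEVEL


def address_space(bits: int) -> str:
    """Format 2**bits bytes using binary prefixes."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    return f"{1 << (bits % 10)} {units[bits // 10]}"


for label, bits in [
    ("4-level linear", linear_address_bits(4)),
    ("5-level linear", linear_address_bits(5)),
    ("Zen 4 physical", 52),
]:
    print(f"{label}: {bits} bits -> {address_space(bits)}")
```

Under these assumptions, four levels give a 48-bit (256 TiB) linear address space and five levels give 57 bits (128 PiB), while a 52-bit physical address covers 4 PiB.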


Package level changes:

  • EPYC 9004 "Genoa": Max. core/thread count 96/192, up from 64/128 on EPYC 7003 "Milan"
  • EPYC "Bergamo": Max. 128 cores but preliminary data shows a slightly altered architecture featuring cores that take up less space
  • Support for DDR5 memory and PCIe Gen 5
  • New sockets AM5 (client), SP5 and SP6 (server), FP7/FP7r2 (mobile)
  • APUs: RDNA2-based iGPU with 2 compute units (128 stream processors)

New Instructions

Zen 4 introduced the following ISA enhancements:

Memory Hierarchy

Data and Instruction Caches

  • L0 Op Cache:
    • Up to 6,912 Ops per core, 12-way set associative
    • 9 Op line size (restrictions apply depending on instruction type)
    • Parity protected
  • L1I Cache:
    • 32 KiB per core, 8-way set associative
    • 64 B line size
    • Parity protected
  • L1D Cache:
    • 32 KiB per core, 8-way set associative
    • 64 B line size
    • Write-back policy
    • 4-5 cycles latency for Int
    • 7-8 cycles latency for FP
    • ECC
  • L2 Cache:
    • 512 KiB or 1 MiB per core (varies by processor model), 8-way set associative
    • 64 B line size
    • Write-back policy
    • Inclusive of L1
    • ≥ 14 cycles latency
    • DEC-TED ECC, tag & state arrays SEC-DED
  • L3 Cache:
    • "Genoa": up to 32 MiB/CCX (8 cores), up to 384 MiB total
    • Shared by all cores in the CCX, configurable
    • 16-way set associative
    • 64 B line size
    • L2 victim cache
    • Write-back policy
    • 50 cycles average load-to-use latency
    • DEC-TED ECC, tag array & shadow tags SEC-DED
    • QoS Monitoring and Enforcement with BMEC, L3RR, L3SBE
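The capacities, associativities and line sizes listed above imply a set count for each cache. The following sketch assumes the conventional set-associative relation capacity = sets × ways × line size; for the Op cache the "line" is counted in Ops rather than bytes. The helper name is hypothetical.

```python
# Illustrative sketch assuming capacity = sets * ways * line_size for every cache.
KIB = 1024
MIB = 1024 * KIB


def num_sets(capacity: int, ways: int, line_size: int) -> int:
    """Number of sets in a set-associative cache; raises if the geometry is not exact."""
    sets, remainder = divmod(capacity, ways * line_size)
    if remainder:
        raise ValueError("capacity is not a multiple of ways * line_size")
    return sets


geometry = {
    "Op cache (Ops)": num_sets(6912, ways=12, line_size=9),
    "L1I": num_sets(32 * KIB, ways=8, line_size=64),
    "L1D": num_sets(32 * KIB, ways=8, line_size=64),
    "L2 (512 KiB)": num_sets(512 * KIB, ways=8, line_size=64),
    "L2 (1 MiB)": num_sets(1 * MIB, ways=8, line_size=64),
    "L3 (32 MiB)": num_sets(32 * MIB, ways=16, line_size=64),
}

for name, sets in geometry.items():
    print(f"{name}: {sets} sets")
```

Under this assumption the Op cache and both L1 caches have 64 sets each, the L2 has 1,024 or 2,048 sets depending on capacity, and a 32 MiB L3 has 32,768 sets.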

Translation Lookaside Buffers

  • ITLB
    • 64 entry L1 TLB, fully associative
      • 4-Kbyte, 2-Mbyte, 1-Gbyte page sizes
    • 512 entry L2 TLB, 8-way set associative
      • 4-Kbyte, 2-Mbyte, and 4-Mbyte pages
    • Parity protected
  • DTLB
    • 72 entry L1 TLB, fully associative
      • 4-Kbyte, 16-Kbyte, 2-Mbyte, 1-Gbyte page sizes
    • 3,072 entry L2 TLB, 24-way set associative
      • 4-Kbyte, 16-Kbyte, 2-Mbyte, and 4-Mbyte pages, PDEs to speed up table walks
    • Parity protected

4-Mbyte pages require two 2-Mbyte entries in all TLBs. 16-Kbyte page size refers to PTE coalescing of four physically consecutive and 16-Kbyte aligned 4-Kbyte pages. All caches and TLBs are competitively shared in multi-threaded mode.
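The entry counts and page sizes above translate into a best-case "TLB reach", the amount of memory that can be mapped without a page table walk. The sketch below is a rough upper bound under the assumption that every entry holds a distinct page of the given size and that one thread owns the whole TLB; the helper name is hypothetical.

```python
# Illustrative upper-bound sketch; real reach depends on page-size mix and SMT sharing.
KIB = 1024
MIB = 1024 * KIB

L1_DTLB_ENTRIES = 72
L2_DTLB_ENTRIES = 3072


def tlb_reach(entries: int, page_bytes: int, entries_per_page: int = 1) -> int:
    """Bytes mapped when all entries hold pages of one size."""
    return (entries // entries_per_page) * page_bytes


reach_l1_4k = tlb_reach(L1_DTLB_ENTRIES, 4 * KIB)
reach_l1_16k = tlb_reach(L1_DTLB_ENTRIES, 16 * KIB)   # four coalesced 4-KiB PTEs per entry
reach_l2_4k = tlb_reach(L2_DTLB_ENTRIES, 4 * KIB)
reach_l2_2m = tlb_reach(L2_DTLB_ENTRIES, 2 * MIB)
reach_l2_4m = tlb_reach(L2_DTLB_ENTRIES, 4 * MIB, entries_per_page=2)  # two entries per 4-MiB page

print(f"L1 DTLB, 4-KiB pages:  {reach_l1_4k // KIB} KiB")
print(f"L1 DTLB, 16-KiB pages: {reach_l1_16k // KIB} KiB")
print(f"L2 DTLB, 4-KiB pages:  {reach_l2_4k // MIB} MiB")
print(f"L2 DTLB, 2-MiB pages:  {reach_l2_2m // MIB} MiB")
print(f"L2 DTLB, 4-MiB pages:  {reach_l2_4m // MIB} MiB")
```

Because a 4-Mbyte page occupies two 2-Mbyte entries, it does not extend the reach beyond what 2-Mbyte pages already provide, whereas PTE coalescing quadruples the reach for suitably aligned 4-Kbyte pages.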

System DRAM

  • Ryzen 7000 "Raphael":
    • Up to PC5-41600 (DDR5-5200) without overclocking
  • EPYC 9004 "Genoa":
    • 12 channels per socket, two 40-bit (32 data, 8 ECC) DDR5 subchannels per channel
    • Up to 24 DIMMs, max. 6 TiB
    • Up to PC5-38400 (DDR5-4800)
    • SR/DR RDIMM, 4R/8R LRDIMM, 3DS DIMM
    • ECC supported (x4, x8, x16, chipkill)
    • DRAM bus parity and write data CRC options

Sources: [2][3][4]
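The PC5 module ratings in the System DRAM list follow directly from the transfer rate and the 64-bit data width of a channel (two subchannels of 32 data bits each). The sketch below shows that arithmetic; it gives theoretical peak figures only, and the function name is hypothetical.

```python
# Illustrative sketch of theoretical peak bandwidth; sustained bandwidth is lower in practice.
def peak_mb_per_s(mega_transfers_per_s: int, channels: int = 1,
                  data_bits_per_channel: int = 64) -> int:
    """Peak bandwidth in MB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * (data_bits_per_channel // 8) * channels


genoa_channel = peak_mb_per_s(4800)               # one DDR5-4800 channel
raphael_channel = peak_mb_per_s(5200)             # one DDR5-5200 channel
genoa_socket = peak_mb_per_s(4800, channels=12)   # all 12 channels of a socket

# Largest DIMM implied by "up to 24 DIMMs, max. 6 TiB"
gib_per_dimm = 6 * 1024 // 24

print(genoa_channel, raphael_channel, genoa_socket, gib_per_dimm)
```

The per-channel results, 38,400 MB/s and 41,600 MB/s, correspond to the PC5-38400 and PC5-41600 designations; twelve DDR5-4800 channels give a theoretical 460.8 GB/s per socket, and the 6 TiB maximum implies 256 GiB per DIMM.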

All Zen 4 Processors

List of all Zen 4-based Processors
Model | Codename | C | T | L2$ | L3$ | Frequency | Turbo | Turbo 1C | Memory | TDP | Launched | Release Price | OPN
Uniprocessors
EPYC 9354P | Genoa | 32 | 64 | 32 MiB | 256 MiB | 3.25 GHz | 3.75 GHz | 3.8 GHz | DDR5-4800 | 280 W | 10 November 2022 | $ 2,730.00 (1k) | 100-100000805, 100-100000805WOF
EPYC 9454P | Genoa | 48 | 96 | 48 MiB | 256 MiB | 2.75 GHz | 3.65 GHz | 3.8 GHz | DDR5-4800 | 290 W | 10 November 2022 | $ 4,598.00 (1k) | 100-100000873, 100-100000873WOF
EPYC 9554P | Genoa | 64 | 128 | 64 MiB | 256 MiB | 3.1 GHz | 3.75 GHz | 3.75 GHz | DDR5-4800 | 360 W | 10 November 2022 | $ 7,104.00 (1k) | 100-100000804, 100-100000804WOF
EPYC 9654P | Genoa | 96 | 192 | 96 MiB | 384 MiB | 2.4 GHz | 3.55 GHz | 3.7 GHz | DDR5-4800 | 360 W | 10 November 2022 | $ 10,625.00 (1k) | 100-100000803, 100-100000803WOF
Ryzen 5 7600X | Raphael | 6 | 12 | 6 MiB | 32 MiB | 4.7 GHz | | 5.3 GHz | | 105 W | | |
Ryzen 7 7700 | Raphael | 8 | 16 | 8 MiB | 32 MiB | 3.8 GHz | | 5.3 GHz | | 65 W | 10 January 2023 | $ 339.00 | 100-000000592, 100-100000592BOX
Ryzen 7 7700X | Raphael | 8 | 16 | 8 MiB | 32 MiB | 4.5 GHz | | 5.4 GHz | | 105 W | 27 September 2022 | $ 399.00 | 100-000000591, 100-100000591WOF
Ryzen 7 7800X3D | Raphael | 8 | 16 | 8 MiB | 96 MiB | 4.2 GHz | | 5 GHz | | 120 W | | |
Ryzen 9 7900X3D | Raphael | 12 | 24 | 12 MiB | 128 MiB | 4.4 GHz | | 5.6 GHz | | 120 W | | |
Ryzen 9 7950X3D | Raphael | 16 | 32 | 16 MiB | 128 MiB | 4.2 GHz | | 5.7 GHz | | 120 W | 28 February 2023 | $ 699.00 | 100-000000908, 100-000000908WOF
Multiprocessors (dual-socket)
EPYC 9124 | Genoa | 16 | 32 | 16 MiB | 64 MiB | 3 GHz | 3.6 GHz | 3.7 GHz | DDR5-4800 | 200 W | 10 November 2022 | $ 1,083.00 (1k) | 100-100000802, 100-100000802WOF
EPYC 9174F | Genoa | 16 | 32 | 16 MiB | 256 MiB | 4.1 GHz | 4.15 GHz | 4.4 GHz | DDR5-4800 | 320 W | 10 November 2022 | $ 3,850.00 (1k) | 100-100000796, 100-100000796WOF
EPYC 9224 | Genoa | 24 | 48 | 24 MiB | 64 MiB | 2.5 GHz | 3.65 GHz | 3.7 GHz | DDR5-4800 | 200 W | 10 November 2022 | $ 1,825.00 (1k) | 100-100000939, 100-100000939WOF
EPYC 9254 | Genoa | 24 | 48 | 24 MiB | 128 MiB | 2.9 GHz | 3.9 GHz | 4.15 GHz | DDR5-4800 | 200 W | 10 November 2022 | $ 2,299.00 (1k) | 100-100000480, 100-100000480WOF
EPYC 9274F | Genoa | 24 | 48 | 24 MiB | 256 MiB | 4.05 GHz | 4.1 GHz | 4.3 GHz | DDR5-4800 | 320 W | 10 November 2022 | $ 3,060.00 (1k) | 100-100000794, 100-100000794WOF
EPYC 9334 | Genoa | 32 | 64 | 32 MiB | 128 MiB | 2.7 GHz | 3.85 GHz | 3.9 GHz | DDR5-4800 | 210 W | 10 November 2022 | $ 2,990.00 (1k) | 100-100000800, 100-100000800WOF
EPYC 9354 | Genoa | 32 | 64 | 32 MiB | 256 MiB | 3.25 GHz | 3.75 GHz | 3.8 GHz | DDR5-4800 | 280 W | 10 November 2022 | $ 3,420.00 (1k) | 100-100000798, 100-100000798WOF
EPYC 9374F | Genoa | 32 | 64 | 32 MiB | 256 MiB | 3.85 GHz | 4.1 GHz | 4.3 GHz | DDR5-4800 | 320 W | 10 November 2022 | $ 4,850.00 (1k) | 100-100000792, 100-100000792WOF
EPYC 9454 | Genoa | 48 | 96 | 48 MiB | 256 MiB | 2.75 GHz | 3.65 GHz | 3.8 GHz | DDR5-4800 | 290 W | 10 November 2022 | $ 5,225.00 (1k) | 100-100000478, 100-100000478WOF
EPYC 9474F | Genoa | 48 | 96 | 48 MiB | 256 MiB | 3.6 GHz | 3.95 GHz | 4.1 GHz | DDR5-4800 | 360 W | 10 November 2022 | $ 6,780.00 (1k) | 100-100000788, 100-100000788WOF
EPYC 9534 | Genoa | 64 | 128 | 64 MiB | 256 MiB | 2.45 GHz | 3.55 GHz | 3.7 GHz | DDR5-4800 | 280 W | 10 November 2022 | $ 8,803.00 (1k) | 100-100000799, 100-100000799WOF
EPYC 9554 | Genoa | 64 | 128 | 64 MiB | 256 MiB | 3.1 GHz | 3.75 GHz | 3.75 GHz | DDR5-4800 | 360 W | 10 November 2022 | $ 9,087.00 (1k) | 100-100000790, 100-100000790WOF
EPYC 9634 | Genoa | 84 | 168 | 84 MiB | 384 MiB | 2.25 GHz | 3.1 GHz | 3.7 GHz | DDR5-4800 | 290 W | 10 November 2022 | $ 10,304.00 (1k) | 100-100000797, 100-100000797WOF
EPYC 9654 | Genoa | 96 | 192 | 96 MiB | 384 MiB | 2.4 GHz | 3.55 GHz | 3.7 GHz | DDR5-4800 | 360 W | 10 November 2022 | $ 11,805.00 (1k) | 100-100000789, 100-100000789WOF
Count: 24
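The L2$ and thread columns of the list follow from the per-core figures given earlier (1 MiB of L2 per core on these models, 2-way SMT). As a small consistency sketch, the snippet below checks a few rows copied from the list; the tuple layout and function name are hypothetical.

```python
# Illustrative consistency check on a few rows copied from the list above.
L2_PER_CORE_MIB = 1     # 1 MiB L2 per core on the listed models
THREADS_PER_CORE = 2    # 2-way SMT

# (model, cores, threads, total L2 in MiB)
rows = [
    ("EPYC 9354P", 32, 64, 32),
    ("Ryzen 5 7600X", 6, 12, 6),
    ("EPYC 9634", 84, 168, 84),
    ("EPYC 9654", 96, 192, 96),
]


def row_is_consistent(cores: int, threads: int, l2_mib: int) -> bool:
    """True if the thread count and total L2 match the per-core figures."""
    return threads == cores * THREADS_PER_CORE and l2_mib == cores * L2_PER_CORE_MIB


for model, cores, threads, l2_mib in rows:
    print(model, row_is_consistent(cores, threads, l2_mib))
```

All four sampled rows satisfy both relations.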

Designers

  • Mike Clark(?), chief architect

Bibliography

References

  1. "Ryzen 7000 Desktop Preview", Angstronomics, August 29, 2022
  2. "Processor Programming Reference (PPR) for AMD Family 19h Models 11h, Revision B1 Processors", AMD Publ. #55901, Rev. 0.25, November 10, 2022
  3. "Software Optimization Guide for the AMD Zen4 Microarchitecture", AMD Publ. #57647, Rev. 1.00, January 6, 2023
  4. "AMD EPYC™ 9004 Series Architecture Overview", AMD Publ. #58015, Rev. 1.1, December 2022

See Also