From WikiChip
Editing amd/microarchitectures/zen 3

Latest revision Your text
Line 7: Line 7:
 
|manufacturer 2=GlobalFoundries
 
|manufacturer 2=GlobalFoundries
 
|introduction=October 8, 2020
 
|introduction=October 8, 2020
|process=7nm, 12nm
+
|process=7nm
|cores=64
+
|process 2=12 nm
|cores 2=56
 
|cores 3=48
 
|cores 4=32
 
|cores 5=28
 
|cores 6=24
 
|cores 7=16
 
|cores 8=12
 
|cores 9=8
 
|cores 10=6
 
 
|type=Superscalar
 
|type=Superscalar
 
|oooe=Yes
 
|oooe=Yes
Line 53: Line 44:
 
|extension 27=UMIP
 
|extension 27=UMIP
 
|extension 28=CLZERO
 
|extension 28=CLZERO
|extension 29=VAES
 
|extension 30=VPCLMUL
 
 
|predecessor=Zen 2
 
|predecessor=Zen 2
 
|predecessor link=amd/microarchitectures/zen 2
 
|predecessor link=amd/microarchitectures/zen 2
Line 67: Line 56:
  
 
== Codenames ==
 
== Codenames ==
[[File:amd zen2-3 roadmap.png|thumb|right|Zen 3 on the roadmap]]
+
[[File:amd zen2-3 roadmap.png|400px|right]]
 +
{{future information}}
  
'''Product Codenames:'''
 
 
{| class="wikitable"
 
{| class="wikitable"
 
|-
 
|-
Line 76: Line 65:
 
| {{amd|Milan|l=core}} || Up to 64/128 || High-end server [[multiprocessors]]
 
| {{amd|Milan|l=core}} || Up to 64/128 || High-end server [[multiprocessors]]
 
|-
 
|-
| {{amd|Chagall|l=core}} || Up to 64/128 || Workstation & enthusiasts market processors
+
| {{amd|Genesis Peak|l=core}} || ?/? || Workstation & enthusiasts market processors
 
|-
 
|-
 
| {{amd|Vermeer|l=core}} || Up to 16/32 || Mainstream to high-end desktops & enthusiasts market processors
 
| {{amd|Vermeer|l=core}} || Up to 16/32 || Mainstream to high-end desktops & enthusiasts market processors
 
|-
 
|-
| {{amd|Cezanne|l=core}} || Up to 8/16 || Mainstream APUs with GPUs
+
| {{amd|Cezanne|l=core}} || Up to 8/16 || Mainstream desktop & mobile processors with GPU  
|}
 
 
 
'''Architectural Codenames:'''
 
{| class="wikitable"
 
|-
 
! Arch !! Codename
 
|-
 
| Core || Cerebrus
 
|-
 
| CCD || Breckenridge
 
|}
 
 
 
== Products ==
 
{{future information}}
 
 
 
{| class="wikitable"
 
|-
 
! Processor Series !! Cores/Threads !! Market
 
|-
 
| EPYC 7003 "{{amd|Milan|l=core}}" || Up to 64/128 || High-end server [[multiprocessors]]
 
|-
 
| {{amd|Trento|l=core}}<!--s/a Milan page--> || ?/? || High-performance computing
 
|-
 
| Ryzen Threadripper 5900 "{{amd|Chagall|l=core}}" || Up to 64/128 || Workstation processors
 
|-
 
| Ryzen 5000 "{{amd|Vermeer|l=core}}" || Up to 16/32 || Mainstream to high-end desktops & enthusiasts market processors
 
|-
 
| Ryzen 5000 APU "{{amd|Cezanne|l=core}}" || Up to 8/16 || Mainstream desktop & mobile processors with integrated GPU
 
 
|}
 
|}
  
 
== Process technology ==
 
== Process technology ==
Zen 3 is fabricated on [[TSMC]]'s [[7 nm process|7nm+ process]] for the Core Compute Die (CCD), the same process used in Zen 2 Refresh processors, and on [[GlobalFoundries]]' [[14 nm process|12nm process]] for the Input/Output Die (IOD).
+
Zen 3 will be fabricated on [[TSMC]]'s [[7 nm process|7nm+ process]], the same process used in Zen 2 Refresh processors.  
  
Note: Only the APU series of microprocessors retains the monolithic design, so they are fabricated solely on [[TSMC]]'s [[7 nm process|7nm+ process]].
+
== Architecture ==
  
== Compiler support ==
+
There is very limited information available about the architectural improvements of Zen 3.
{| class="wikitable"
 
|-
 
! Compiler !! Arch-Specific || Arch-Favorable
 
|-
 
| [[GCC]] || <code>-march=znver3</code> || <code>-mtune=znver3</code>
 
|-
 
| [[LLVM]] || <code>-march=znver3</code> || <code>-mtune=znver3</code>
 
|}
 
* '''Note:''' Initial support in GCC 10.3 and LLVM 12.0.
 
  
== Architecture ==
+
=== Key changes from {{\\|Zen 2}} ===
  
=== Key changes from {{\\|Zen 2}} ===
+
* +19% IPC
* CCD
+
* Unified 8-core CCX with 32 MiB L3$ available to all 8 cores equally. Latency increased by roughly 7 cycles (18%) to an average of 46 cycles.
** Unified 8-core CCX (from 2x 4-Core CCX per CCD)
+
* Integer unit:
** 32 MiB L3$ available equally to all cores in CCD.
+
** Integer physical register file increased from 180 to 192 entries
*** Increased L3 latency (~46 cycles, up from ~40 cycles)
+
** Issue increased from 7 (existing 4 ALU and 3 AGU) to 10 with 1 new dedicated branch execution port and 2 separated store data pathways.
* Core
+
** Schedulers shared between pairs of ALU + AGU/branch ports instead of dedicated for each.
** Higher [[IPC]] (AMD self-reported +19% IPC)
+
** Instruction redundancy increased between ports for reduced bottlenecking on a wider variety of instruction streams.
** Front-end
+
** 8/16/32/64 bit signed integer division/modulo latency improved from 17/22/30/46 cycles to 10/12/14/20 cycles. Unsigned operations are about 1 cycle faster in some cases on both Zen 2 and Zen 3. Throughput improves proportionately.
*** Increased branch prediction bandwidth
+
* Floating point unit:
*** "zero-bubble" branch prediction
+
** FMA latency reduced by 1 cycle from 5 to 4.
*** L1 BTB doubled from 512 to 1024 entries
+
** Fifth and sixth dedicated execution ports added for floating point store and FP-to-int transfer, no longer sharing 2nd FADD port.
*** Improved prefetching
+
** Unified scheduler split into 1 scheduler per FMA/FADD/transfer port set.
*** Improved µop cache
+
** 256b VAES and VPCLMULQDQ support for doubled AES and AES-GCM cryptographic throughput.
** Back-end
+
** Hardware implementation of BMI2 PDEP/PEXT bit scatter/gather operations, compared to prior microcode emulation.
*** Floating point unit:
+
* Load/store:
**** FMA latency reduced by 1 cycle from 5 to 4.
+
** Load throughput increased from 2 to 3 per cycle for loads narrower than 256 bits.
**** Fifth and sixth dedicated execution ports added for floating point store and FP-to-int transfer, no longer sharing 2nd FADD port.
+
** Store throughput increased from 1 to 2 per cycle for stores narrower than 256 bits.
**** Unified scheduler split into 1 scheduler per FMA/FADD/transfer port set.
+
** Store queue increased from 48 to 64 slots.
**** 256b VAES and VPCLMULQDQ support for doubled AES and AES-GCM cryptographic throughput.
+
** Page table walkers tripled from 2 to 6 for TLB miss handling.
**** Hardware implementation of BMI2 PDEP/PEXT bit scatter/gather operations, compared to prior microcode emulation.
+
* Improved prefetching
*** Integer unit:
+
* Increased branch prediction bandwidth
**** Integer physical register file increased from 180 to 192 entries
+
** "zero-bubble" branch prediction
**** Issue increased from 7 (existing 4 ALU and 3 AGU) to 10 with 1 new dedicated branch execution port and 2 separated store data pathways.
+
** L1 BTB doubled from 512 to 1024 entries
**** Schedulers shared between pairs of ALU + AGU/branch ports instead of dedicated for each.
+
* Improved µop cache
**** Instruction redundancy increased between ports for reduced bottlenecking on a wider variety of instruction streams.
 
**** 8/16/32/64 bit signed integer division/modulo latency improved from 17/22/30/46 cycles to 10/12/14/20 cycles. Unsigned operations are about 1 cycle faster in some cases on both Zen 2 and Zen 3. Throughput improves proportionately.
 
*** Load/store:
 
**** Load throughput increased from 2 to 3 per cycle for loads narrower than 256 bits.
 
**** Store throughput increased from 1 to 2 per cycle for stores narrower than 256 bits.
 
**** Store queue increased from 48 to 64 slots.
 
**** Page table walkers tripled from 2 to 6 for TLB miss handling.
 
 
{{expand list}}
 
{{expand list}}
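
As a rough illustration of the integer division figures listed above, the following sketch computes the latency ratio between Zen 2 and Zen 3 for each operand width. The cycle counts are copied from the list; the dictionary and function names are hypothetical and chosen only for this example. The result is a latency ratio only, not a measured application speedup.

```python
# Signed integer division/modulo latency in cycles, copied from the list above.
# Keys are operand widths in bits. All names here are illustrative only.
ZEN2_DIV_LATENCY = {8: 17, 16: 22, 32: 30, 64: 46}
ZEN3_DIV_LATENCY = {8: 10, 16: 12, 32: 14, 64: 20}


def div_latency_speedup(bits):
    """Return the Zen 2 / Zen 3 latency ratio for one operand width."""
    return ZEN2_DIV_LATENCY[bits] / ZEN3_DIV_LATENCY[bits]


if __name__ == "__main__":
    for bits in sorted(ZEN2_DIV_LATENCY):
        old = ZEN2_DIV_LATENCY[bits]
        new = ZEN3_DIV_LATENCY[bits]
        ratio = div_latency_speedup(bits)
        print(f"{bits:>2}-bit: {old} -> {new} cycles ({ratio:.2f}x lower latency)")
```

The ratio grows with operand width, from 1.7x for 8-bit operands to 2.3x for 64-bit operands.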
 
=== New Instructions ===
 
Zen 3 introduced the following ISA enhancements:
 
 
* {{x86|VAES}} - 256-bit Vector AES instructions
 
** <code>VAESDEC</code> - AES Decryption Round
 
** <code>VAESDECLAST</code> - AES Last Decryption Round
 
** <code>VAESENC</code> - AES Encryption Round
 
** <code>VAESENCLAST</code> - AES Last Encryption Round
 
* <code>{{x86|VPCLMULQDQ}}</code> - 256-bit Vector Carry-Less Multiplication of Quadwords
 
* {{x86|PCID}} - Process Context Identifiers
 
** <code>{{x86|INVPCID}}</code> - Invalidate TLB entry(s) in a specified PCID
 
* {{x86|INVLPGB}} - Broadcast TLB flushing
 
** <code>INVLPGB</code> - Invalidate TLB entry(s) with broadcast to all processors
 
** <code>TLBSYNC</code> - Synchronize TLB invalidations
 
* {{x86|PKU}} - Memory Protection Keys for Users
 
** <code>RDPKRU</code> - Read Protection Key Rights
 
** <code>WRPKRU</code> - Write Protection Key Rights
 
* {{x86|CET|CET_SS}} - Control-flow Enforcement Technology / Shadow Stack
 
** <code>CLRSSBSY</code>, <code>INCSSP</code>, <code>RDSSP</code>, <code>RSTORSSP</code>, <code>SAVEPREVSSP</code>, <code>SETSSBSY</code>, <code>WRSS</code>, <code>WRUSS</code>
 
* {{x86|SME|SEV-SNP}} - 3rd generation Secure Encrypted Virtualization - Secure Nested Paging
 
** <code>PSMASH</code>, <code>PVALIDATE</code>, <code>RMPADJUST</code>, <code>RMPUPDATE</code>
 
* {{x86|PSFD}} - Predictive Store Forwarding Disable (Speculation Control MSR)<ref name="amd-psf"/>
 
 
Sources: <ref name="amd-24593-apm2"/><ref name="amd-24594-apm3"/><ref name="amd-26568-apm4"/>
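
One possible way to check for some of these extensions on a Linux system is to inspect the <code>flags</code> line of <code>/proc/cpuinfo</code>. The sketch below is a minimal example under stated assumptions: the lowercase flag names (<code>vaes</code>, <code>vpclmulqdq</code>, <code>pcid</code>, <code>invpcid</code>, <code>pku</code>) are the Linux kernel's names and do not come from this article, and the function names are hypothetical. Extensions whose kernel flag name varies by kernel version are deliberately left out.

```python
# Assumed mapping from extension name (as used in this article) to the flag
# string the Linux kernel prints in /proc/cpuinfo. Not taken from the article.
FLAG_NAMES = {
    "VAES": "vaes",
    "VPCLMULQDQ": "vpclmulqdq",
    "PCID": "pcid",
    "INVPCID": "invpcid",
    "PKU": "pku",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report_extensions(cpuinfo_text):
    """Map each extension name to True/False depending on flag presence."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FLAG_NAMES.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # Not Linux, or the file is unreadable: report nothing found.
    for name, present in report_extensions(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

A flag only shows that the kernel detected the feature; it may also be hidden by a hypervisor or disabled by firmware.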
 
 
=== Memory Hierarchy ===
 
==== Data and Instruction Caches ====
 
* L0 Op Cache:
 
** 4,096 Ops per core, 8-way set associative
 
** 8 Op line size
 
** Parity protected
 
* L1I Cache:
 
** 32 KiB per core, 8-way set associative
 
** 64 B line size
 
** Parity protected
 
* L1D Cache:
 
** 32 KiB per core, 8-way set associative
 
** 64 B line size
 
** Write-back policy
 
** 4-5 cycles latency for Int
 
** 7-8 cycles latency for FP
 
** ECC
 
* L2 Cache:
 
** 512 KiB per core, 8-way set associative
 
** 64 B line size
 
** Write-back policy
 
** Inclusive of L1
 
** ≥ 12 cycles latency
 
** {{abbr|DEC-TED}} ECC, tag & state arrays {{abbr|SEC-DED}}<!--7 check bits for 42 tag bits; AMD-55898-0.50 Sec 3.5-->
 
* L3 Cache:
 
** "{{amd|Milan|l=core}}" & "{{amd|Chagall|l=core}}": 32 MiB/CCX, up to 256 MiB total
 
** "{{amd|Vermeer|l=core}}": 32 MiB/CCX, up to 64 MiB total
 
** "{{amd|Cezanne|l=core}}": 16 MiB, 8 MiB usable on some SKUs
 
** Shared by all cores in the {{abbr|CCX}}, configurable<ref name="amd-56375-qos"/>
 
** 16-way set associative
 
** 64 B line size
 
** L2 [[victim cache]]
 
** Write-back policy
 
** 46 cycles average load-to-use latency
 
** DEC-TED ECC, tag array & shadow tags SEC-DED<!--AMD-55898-0.50 Sec 3.5-->
 
** QoS Monitoring and Enforcement V2.0
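
The cache parameters above determine the number of sets in each cache: capacity divided by the product of associativity and line size. The sketch below applies that formula to the listed figures. It treats each cache as a single array and ignores any internal slicing or banking, which is a simplifying assumption; the function and constant names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_sets(size_bytes, ways, line_bytes=64):
    """Number of sets = capacity / (associativity * line size)."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("capacity is not a multiple of ways * line size")
    return sets


# (capacity in bytes, associativity), taken from the list above.
# All of these caches use 64 B lines.
CACHES = {
    "L1I": (32 * KIB, 8),
    "L1D": (32 * KIB, 8),
    "L2": (512 * KIB, 8),
    "L3 (per CCX)": (32 * MIB, 16),
}

if __name__ == "__main__":
    for name, (size, ways) in CACHES.items():
        print(f"{name}: {cache_sets(size, ways)} sets")
```

Under these assumptions the L1 caches have 64 sets each, the L2 has 1,024 sets, and a 32 MiB L3 has 32,768 sets.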
 
 
==== Translation Lookaside Buffers ====
 
* ITLB
 
** 64 entry L1 TLB, fully associative, all page sizes
 
** 512 entry L2 TLB, 8-way set associative
 
*** 4-Kbyte and 2-Mbyte pages
 
** Parity protected
 
* DTLB
 
** 64 entry L1 TLB, fully associative, all page sizes
 
** 2,048 entry L2 TLB, 16-way set associative
 
*** 4-Kbyte and 2-Mbyte pages, PDEs to speed up table walks
 
** Parity protected
 
 
All caches and TLBs are competitively shared in multi-threaded mode.
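
The TLB sizes above can be turned into an approximate "reach", the amount of memory that can be mapped without a TLB miss, by multiplying the entry count by the page size. The sketch below does this for 4 KiB and 2 MiB pages. It is an upper bound that assumes every entry holds a page of the same size, and the function names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach(entries, page_bytes):
    """Upper bound on memory covered when every entry maps one page."""
    return entries * page_bytes


def format_size(num_bytes):
    """Render a byte count using binary units up to GiB."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if value < 1024 or unit == "GiB":
            return f"{value:g} {unit}"
        value /= 1024


# Entry counts taken from the list above.
TLBS = {"L1 ITLB": 64, "L2 ITLB": 512, "L1 DTLB": 64, "L2 DTLB": 2048}

if __name__ == "__main__":
    for name, entries in TLBS.items():
        small = format_size(tlb_reach(entries, 4 * KIB))
        large = format_size(tlb_reach(entries, 2 * MIB))
        print(f"{name}: {small} with 4 KiB pages, {large} with 2 MiB pages")
```

For example, the 2,048-entry L2 DTLB covers at most 8 MiB with 4 KiB pages and 4 GiB with 2 MiB pages.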
 
 
==== System DRAM ====
 
* EPYC 7003 "{{amd|Milan|l=core}}":
 
** 8 channels per socket, up to 16 DIMMs, max. 4&nbsp;TiB
 
** Up to PC4-25600L (DDR4-3200)
 
** {{abbr|SR}}/{{abbr|DR}} {{abbr|RDIMM}}, {{abbr|4R}}/{{abbr|8R}} {{abbr|LRDIMM}}, {{abbr|3DS DIMM}}, {{abbr|NVDIMM-N}}
 
** ECC supported (x4, x8, x16, chipkill)<!--AMD-55898-0.50 Sec 3.7-->
 
** DRAM bus parity and write data CRC options<!--ibid-->
 
 
* Ryzen Threadripper 5900 "{{amd|Chagall|l=core}}":
 
** 8 channels, up to 8 DIMMs, max. 2&nbsp;TiB
 
** Up to PC4-25600L (DDR4-3200)
 
** SR/DR {{abbr|UDIMM}}, RDIMM, LRDIMM, 3DS DIMM
 
** ECC supported
 
 
* Ryzen 5000 "{{amd|Vermeer|l=core}}":
 
** 2 channels, up to 4 DIMMs, max. 128&nbsp;GiB
 
** Up to PC4-25600U (DDR4-3200 UDIMM), ECC supported
 
 
* Ryzen 5000 APU "{{amd|Cezanne|l=core}}":
 
** {{amd|Socket AM4|l=pack}}:
 
*** 2 channels, up to 4 DIMMs, max. 128&nbsp;GiB
 
*** Up to PC4-25600U (DDR4-3200 UDIMM), ECC supported ("PRO" models)
 
** {{amd|FP6|FP6 package|l=pack}}, DDR4 mode:
 
*** 2 × 64-bit channels, up to 2 DIMMs, max. 64&nbsp;GiB
 
*** Up to PC4-25600U (DDR4-3200 UDIMM), ECC supported(?)
 
** FP6 package, LPDDR4 mode:
 
*** 4 × 32-bit channels, max. 32&nbsp;GiB
 
*** Up to LPDDR4X-4266
 
 
Sources: <ref name="amd-56375-qos"/><ref name="amd-56665-sog-19h"/><ref name="amd-55898-ppr-1901-0.35"/><ref name="amd-55898-ppr-1901-0.50"/><ref name="amd-56178-mdg-fp6"/>
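
The channel counts and memory speeds above give a theoretical peak DRAM bandwidth: transfer rate times bus width in bytes times number of channels. The sketch below works this out for the listed configurations. It is a back-of-the-envelope figure in decimal MB/s that ignores refresh, bus turnaround, and other overheads, and the function name is hypothetical.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_bits, channels):
    """Theoretical peak bandwidth in MB/s (decimal megabytes)."""
    return mega_transfers_per_s * (bus_bits // 8) * channels


# (transfer rate in MT/s, channel width in bits, channel count), from the list above.
CONFIGS = {
    "Milan, DDR4-3200, 8 channels": (3200, 64, 8),
    "Chagall, DDR4-3200, 8 channels": (3200, 64, 8),
    "Vermeer, DDR4-3200, 2 channels": (3200, 64, 2),
    "Cezanne FP6, LPDDR4X-4266, 4 x 32-bit": (4266, 32, 4),
}

if __name__ == "__main__":
    for name, (rate, width, channels) in CONFIGS.items():
        mb_s = peak_bandwidth_mb_s(rate, width, channels)
        print(f"{name}: {mb_s / 1000:.1f} GB/s")
```

A single 64-bit DDR4-3200 channel yields 25,600 MB/s, which is where the PC4-25600 module designation comes from.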
 
  
 
== All Zen 3 Chips ==
 
== All Zen 3 Chips ==
  
<!-- NOTE:
+
<!-- NOTE:  
 
           This table is generated automatically from the data in the actual articles.
 
           This table is generated automatically from the data in the actual articles.
 
           If a microprocessor is missing from the list, an appropriate article for it needs to be
 
           If a microprocessor is missing from the list, an appropriate article for it needs to be
Line 347: Line 185:
 
* AMD 'Tech Day', February 22, 2017
 
* AMD 'Tech Day', February 22, 2017
 
* AMD 2017 Financial Analyst Day, May 16, 2017
 
* AMD 2017 Financial Analyst Day, May 16, 2017
 
== References ==
 
<references>
 
<ref name="amd-psf">{{cite techdoc|title=White Paper: Security Analysis of AMD Predictive Store Forwarding|url=https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf|publ=AMD|date=2021-03}}</ref>
 
<ref name="amd-24593-apm2">{{cite techdoc|title=AMD64 Architecture Programmer’s Manual Volume 2: System Programming|url=https://www.amd.com/system/files/TechDocs/24593.pdf|publ=AMD|pid=24593|rev=3.37|date=2021-03}}</ref>
 
<ref name="amd-24594-apm3">{{cite techdoc|title=AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions|url=https://www.amd.com/system/files/TechDocs/24594.pdf|publ=AMD|pid=24594|rev=3.32|date=2021-03}}</ref>
 
<ref name="amd-26568-apm4">{{cite techdoc|title=AMD64 Architecture Programmer’s Manual Volume 4: 128-Bit and 256-Bit Media Instructions|url=https://www.amd.com/system/files/TechDocs/26568.pdf|publ=AMD|pid=26568|rev=3.24|date=2020-05}}</ref>
 
<ref name="amd-56375-qos">{{cite techdoc|title=AMD64 Technology Platform Quality of Service Extensions|url=https://developer.amd.com/wp-content/resources/56375.pdf|publ=AMD|pid=56375|rev=1.02|date=2020-10}}</ref>
 
<ref name="amd-56665-sog-19h">{{cite techdoc|title=Software Optimization Guide for AMD Family 19h Processors (PUB)|url=https://www.amd.com/system/files/TechDocs/56665.zip|publ=AMD|pid=56665|rev=3.00|date=2020-11}}</ref>
 
<ref name="amd-55898-ppr-1901-0.35">{{cite techdoc|title=Preliminary Processor Programming Reference (PPR) for AMD Family 19h Model 01h, Revision B1 Processors|url=https://www.amd.com/system/files/TechDocs/55898_pub.zip|publ=AMD|pid=55898|rev=0.35|date=2021-02-05}}</ref>
 
<ref name="amd-55898-ppr-1901-0.50">{{cite techdoc|title=Preliminary Processor Programming Reference (PPR) for AMD Family 19h Model 01h, Revision B1 Processors|publ=AMD|pid=55898|rev=0.50|date=2021-05-27}}</ref>
 
<ref name="amd-56178-mdg-fp6">{{cite techdoc|title=FP6 Processor Motherboard Design Guide|publ=AMD|pid=56178|rev=1.03|date=2020-01}}</ref>
 
</references>
 
  
 
== See Also ==
 
== See Also ==
* AMD {{\\|Zen}}, {{\\|Zen 2}}, {{\\|Zen 4}}
+
* AMD {{\\|Zen}}
 
* Intel {{intel|Tigerlake|l=arch}}
 
* Intel {{intel|Tigerlake|l=arch}}
 
* Read also: [https://www.anandtech.com/print/16214/amd-zen-3-ryzen-deep-dive-review-5950x-5900x-5800x-and-5700x-tested AMD Zen 3 Ryzen Deep Dive Review]
 
* Read also: [https://www.anandtech.com/print/16214/amd-zen-3-ryzen-deep-dive-review-5950x-5900x-5800x-and-5700x-tested AMD Zen 3 Ryzen Deep Dive Review]
* Read here: [https://techmotherboard.com/best-zen-3-cpu/ AMD Zen 3 Reviews]
 

codename: Zen 3
core count: 64, 56, 48, 32, 28, 24, 16, 12, 8 and 6
designer: AMD
first launched: October 8, 2020
full page name: amd/microarchitectures/zen 3
instance of: microarchitecture
instruction set architecture: x86-64
manufacturer: TSMC and GlobalFoundries
microarchitecture type: CPU
name: Zen 3
pipeline stages: 19