From WikiChip
Editing amd/microarchitectures/zen 3


Latest revision Your text
Line 137: Line 137:
 
** Higher [[IPC]] (AMD self-reported +19% IPC)
 
** Higher [[IPC]] (AMD self-reported +19% IPC)
 
** Front-end
 
** Front-end
*** Increased branch prediction bandwidth
+
** Increased branch prediction bandwidth
 
*** "zero-bubble" branch prediction
 
*** "zero-bubble" branch prediction
 
*** L1 BTB doubled from 512 to 1024 entries
 
*** L1 BTB doubled from 512 to 1024 entries
*** Improved prefetching
+
** Improved prefetching
*** Improved µop cache
+
** Improved µop cache
** Back-end
+
* Back-end
*** Floating point unit:
+
** Floating point unit:
**** FMA latency reduced by 1 cycle from 5 to 4.
+
*** FMA latency reduced by 1 cycle from 5 to 4.
**** Fifth and sixth dedicated execution ports added for floating point store and FP-to-int transfer, no longer sharing 2nd FADD port.
+
*** Fifth and sixth dedicated execution ports added for floating point store and FP-to-int transfer, no longer sharing 2nd FADD port.
**** Unified scheduler split into 1 scheduler per FMA/FADD/transfer port set.
+
*** Unified scheduler split into 1 scheduler per FMA/FADD/transfer port set.
**** 256b VAES and VPCLMULDQ support for doubled AES and AES-GCM cryptographic throughput.
+
*** 256b VAES and VPCLMULDQ support for doubled AES and AES-GCM cryptographic throughput.
**** Hardware implementation of BMI2 PDEP/PEXT bit scatter/gather operations, compared to prior microcode emulation.
+
*** Hardware implementation of BMI2 PDEP/PEXT bit scatter/gather operations, compared to prior microcode emulation.
*** Integer unit:
+
** Integer unit:
**** Integer physical register file increased from 180 to 192 entries
+
*** Integer physical register file increased from 180 to 192 entries
**** Issue increased from 7 (existing 4 ALU and 3 AGU) to 10 with 1 new dedicated branch execution port and 2 separated store data pathways.
+
*** Issue increased from 7 (existing 4 ALU and 3 AGU) to 10 with 1 new dedicated branch execution port and 2 separated store data pathways.
**** Schedulers shared between pairs of ALU + AGU/branch ports instead of dedicated for each.
+
*** Schedulers shared between pairs of ALU + AGU/branch ports instead of dedicated for each.
**** Instruction redundancy increased between ports for reduced bottlenecking on a wider variety of instruction streams.
+
*** Instruction redundancy increased between ports for reduced bottlenecking on a wider variety of instruction streams.
**** 8/16/32/64-bit signed integer division/modulo latency improved from 17/22/30/46 cycles to 10/12/14/20 cycles. (Unsigned operations are about 1 cycle faster in some cases, on both the old and new designs.) Throughput improves proportionately.
+
*** 8/16/32/64-bit signed integer division/modulo latency improved from 17/22/30/46 cycles to 10/12/14/20 cycles. (Unsigned operations are about 1 cycle faster in some cases, on both the old and new designs.) Throughput improves proportionately.
*** Load/store:
+
** Load/store:
**** Load throughput increased from 2 to 3 per cycle, except for 256-bit loads.
+
*** Load throughput increased from 2 to 3 per cycle, except for 256-bit loads.
**** Store throughput increased from 1 to 2 per cycle, except for 256-bit stores.
+
*** Store throughput increased from 1 to 2 per cycle, except for 256-bit stores.
**** Store queue increased from 48 to 64 slots.
+
*** Store queue increased from 48 to 64 slots.
**** Page table walkers tripled from 2 to 6 for TLB miss handling.
+
*** Page table walkers tripled from 2 to 6 for TLB miss handling.
 
{{expand list}}
 
{{expand list}}
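
The BMI2 PDEP/PEXT bit scatter/gather operations mentioned in the list above can be modelled in software. The following is a minimal sketch of their semantics only, not of AMD's hardware or of the earlier microcode; the function names <code>pext</code> and <code>pdep</code> are illustrative choices, and Python's unbounded integers stand in for 32- or 64-bit registers.

```python
def pext(src: int, mask: int) -> int:
    """Gather: pack the bits of src selected by mask into the low bits."""
    result = 0
    out_pos = 0
    while mask:
        lowest = mask & -mask        # isolate the lowest set bit of mask
        if src & lowest:
            result |= 1 << out_pos   # copy that src bit to the next low position
        out_pos += 1
        mask &= mask - 1             # clear the lowest set bit of mask
    return result


def pdep(src: int, mask: int) -> int:
    """Scatter: spread the low bits of src to the positions set in mask."""
    result = 0
    in_pos = 0
    while mask:
        lowest = mask & -mask        # isolate the lowest set bit of mask
        if (src >> in_pos) & 1:
            result |= lowest         # place the next low src bit at that position
        in_pos += 1
        mask &= mask - 1             # clear the lowest set bit of mask
    return result


if __name__ == "__main__":
    x, m = 0b10110010, 0b11110000
    print(bin(pext(x, m)))           # gathers the high nibble of x
    print(bin(pdep(pext(x, m), m)))  # scatters it back to the masked positions
```

In this model the loop runs once per set bit of the mask, which gives a rough sense of why a bit-serial implementation is slow for dense masks. For any <code>x</code> and <code>m</code>, <code>pdep(pext(x, m), m)</code> equals <code>x & m</code>.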
  

codename: Zen 3
core count: 64, 56, 48, 32, 28, 24, 16, 12, 8 and 6
designer: AMD
first launched: October 8, 2020
full page name: amd/microarchitectures/zen 3
instance of: microarchitecture
instruction set architecture: x86-64
manufacturer: TSMC and GlobalFoundries
microarchitecture type: CPU
name: Zen 3
pipeline stages: 19