From WikiChip
Editing amd/microarchitectures/zen
Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.
The edit can be undone.
Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
This page supports semantic in-text annotations (e.g. "[[Is specified as::World Heritage Site]]") to build structured and queryable content provided by Semantic MediaWiki. For a comprehensive description on how to use annotations or the #ask parser function, please have a look at the getting started, in-text annotation, or inline queries help pages.
Latest revision | Your text | ||
Line 462: | Line 462: | ||
Generally, the four ALUs will execute four integer instructions per cycle. Simple operations can be done by any of the ALUs whereas the more expensive multiplication and division ones can only be done by their respective ALUs (there is one of each). Additionally, two of Zen's [[ALU]]s are capable of performing a branch, therefore Zen can peak at 2 branches per cycle. This only occurs if they are not taken. The two branches can simultaneously execute two branch instructions from the same thread or from two separate threads. If the branch is taken, Zen is restricted to only 1 branch per cycle. This is a similar restriction which is found in Intel's architectures such as {{intel|Haswell|l=arch}}. In {{intel|Haswell|l=arch}}, port 0 can only execute predicted "not-taken" branches whereas port 6 can perform both "taken" and "not taken". AMD's reason for adding a second branch is driven by an entirely different reason compared to Haswell which had done the same. The second branch unit in {{intel|Haswell|l=arch}} was added largely in an effort to mitigate port contention. Prior to that change, code involving tight loops that performed SSE operations ended up fighting over the same port as both the SSE operation and the actual branch ended up being scheduled on the same port. Zen doesn't actually have this issue. The addition of a second branch unit in their case serves to purely boost the performance of branch-heavy code. | Generally, the four ALUs will execute four integer instructions per cycle. Simple operations can be done by any of the ALUs whereas the more expensive multiplication and division ones can only be done by their respective ALUs (there is one of each). Additionally, two of Zen's [[ALU]]s are capable of performing a branch, therefore Zen can peak at 2 branches per cycle. This only occurs if they are not taken. The two branches can simultaneously execute two branch instructions from the same thread or from two separate threads. If the branch is taken, Zen is restricted to only 1 branch per cycle. This is a similar restriction which is found in Intel's architectures such as {{intel|Haswell|l=arch}}. In {{intel|Haswell|l=arch}}, port 0 can only execute predicted "not-taken" branches whereas port 6 can perform both "taken" and "not taken". AMD's reason for adding a second branch is driven by an entirely different reason compared to Haswell which had done the same. The second branch unit in {{intel|Haswell|l=arch}} was added largely in an effort to mitigate port contention. Prior to that change, code involving tight loops that performed SSE operations ended up fighting over the same port as both the SSE operation and the actual branch ended up being scheduled on the same port. Zen doesn't actually have this issue. The addition of a second branch unit in their case serves to purely boost the performance of branch-heavy code. | ||
− | The 2 [[AGU]]s can be used in conjunction with the ALUs. µOPs involving a memory operands will make use of both at the same time and will not be (i.e., the operations don't get split up). Zen is capable of a read+write or read+read operations in one cycle | + | The 2 [[AGU]]s can be used in conjunction with the ALUs. µOPs involving a memory operands will make use of both at the same time and will not be (i.e., the operations don't get split up). Zen is capable of a read+write or read+read operations in one cycle. |
===== Floating Point ===== | ===== Floating Point ===== |
Facts about "Zen - Microarchitectures - AMD"
codename | Zen + |
core count | 4 +, 6 +, 8 +, 16 +, 24 +, 32 + and 12 + |
designer | AMD + |
first launched | March 2, 2017 + |
full page name | amd/microarchitectures/zen + |
instance of | microarchitecture + |
instruction set architecture | x86-64 + |
manufacturer | GlobalFoundries + |
microarchitecture type | CPU + |
name | Zen + |
pipeline stages | 19 + |
process | 14 nm (0.014 μm, 1.4e-5 mm) + |