From WikiChip
Editing intel/microarchitectures/bonnell

Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.

The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.

This page supports semantic in-text annotations (e.g. "[[Is specified as::World Heritage Site]]") to build structured and queryable content provided by Semantic MediaWiki. For a comprehensive description on how to use annotations or the #ask parser function, please have a look at the getting started, in-text annotation, or inline queries help pages.

Latest revision Your text
Line 284: Line 284:
  
  
Unlike {{\\|P5}}, which only had 5 stages, Bonnell has 16 to 19 pipeline stages. The longer pipeline allows a more even spreading of heat across the chip with more units. This also allows a higher clock rate.  
+
Unlike {{\\|P5}}, which only had 5 stages, Bonnell has 16 to 19 stages pipeline. The longer pipeline allows a more evenly spreading of heat across the chip with more units. This also allows a higher clock rate.  
  
 
==== Front End ====
 
==== Front End ====
Bonnell's front end is very simple when compared to Intel's high-performance architectures. [[Out-of-order execution]] (OoOE) that is found ubiquitously in all HPC architectures was rejected. Bonnell's power and area constraints simply couldn't allow for the complex logic needed to support that capability. The [[Instruction Fetch]] consists of 3 stages, capable of going through up to 16 bytes per cycle. Like fetch, the [[Instruction Decode]] is also 3 stages, capable of decording instructions with up to 3 prefixes each cycle (considerably longer for more complex instructions).
+
Bonnell's front end is very simple when compared to Intel's high-performance architectures. [[Out-of-order execution]] (OoOE) that is found ubiquitously in all HPC architectures was rejected. Bonnell's power and area constraints simply couldn't allow for the complex logic needed to support that capability. The [[Instruction Fetch]] consists of 3 stages capable going through up to 16 bytes per cycle. Like fetch, the [[Instruction Decode]] is also 3 stages capable of decording instructions with up to 3 prefixes each cycle (considerably longer for more complex instructions).
  
 
Bonnell is a departure from all modern x86 architectures with respect to decoding (including those developed by [[AMD]] and [[VIA]] and every Intel architecture since {{\\|P6}}). Whereas modern architectures transform complex [[x86]] instructions into a more easily digestible µop form, Bonnell does almost no such transformations. The pipeline is tailored to execute regular x86 instructions as single atomic operations consisting of a single destination register and up to three source-registers (typical load-operate-store format). Most instructions actually correspond very closely to the original x86 instructions. This design choice results in lower complexity but at the cost of performance reduction. Bonnell has two identical decoders capable of decoding complex x86 instructions. Being variable length instruction architecture introduces an additional layer of complexity. To assist the decoders, Bonnell implements predecoders that determine instruction boundaries and mark them using a single-bit marker. Two cycles are allocated for predecoding as well as L1 storage. Boundary marks are also stored in the L1 eliminating the need to preform needlessly redundant predecoding. Repeated operations are retrieved pre-marked eliminating two cycles. Bonnel has a 36 KiB L1 instruction cache consisting of 32 KiB instruction cache and 4 KiB instruction boundary mark cache. All instructions (coming from both cache or predecode) must undergo full decode. It's worthwhile noting that Intel states Bonnell is a 16-stage pipeline because for the most part, after a cache hit you'll have 16 stages. This is also true in some cases where the processor can simultaneously decode the next instruction. However, in the cases where you get a miss, it will cost 3 additional stages to catch up and locate the boundary for that instruction for a total of 19 stages.
 
Bonnell is a departure from all modern x86 architectures with respect to decoding (including those developed by [[AMD]] and [[VIA]] and every Intel architecture since {{\\|P6}}). Whereas modern architectures transform complex [[x86]] instructions into a more easily digestible µop form, Bonnell does almost no such transformations. The pipeline is tailored to execute regular x86 instructions as single atomic operations consisting of a single destination register and up to three source-registers (typical load-operate-store format). Most instructions actually correspond very closely to the original x86 instructions. This design choice results in lower complexity but at the cost of performance reduction. Bonnell has two identical decoders capable of decoding complex x86 instructions. Being variable length instruction architecture introduces an additional layer of complexity. To assist the decoders, Bonnell implements predecoders that determine instruction boundaries and mark them using a single-bit marker. Two cycles are allocated for predecoding as well as L1 storage. Boundary marks are also stored in the L1 eliminating the need to preform needlessly redundant predecoding. Repeated operations are retrieved pre-marked eliminating two cycles. Bonnel has a 36 KiB L1 instruction cache consisting of 32 KiB instruction cache and 4 KiB instruction boundary mark cache. All instructions (coming from both cache or predecode) must undergo full decode. It's worthwhile noting that Intel states Bonnell is a 16-stage pipeline because for the most part, after a cache hit you'll have 16 stages. This is also true in some cases where the processor can simultaneously decode the next instruction. However, in the cases where you get a miss, it will cost 3 additional stages to catch up and locate the boundary for that instruction for a total of 19 stages.

Please note that all contributions to WikiChip may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see WikiChip:Copyrights for details). Do not submit copyrighted work without permission!

Cancel | Editing help (opens in new window)