From WikiChip
Editing intel/microarchitectures/bonnell


Latest revision Your text
Line 291: Line 291:
 
Bonnell is a departure from all modern x86 architectures with respect to decoding (including those developed by [[AMD]] and [[VIA]] and every Intel architecture since {{\\|P6}}). Whereas modern architectures transform complex [[x86]] instructions into a more easily digestible µop form, Bonnell performs almost no such transformation. The pipeline is tailored to execute regular x86 instructions as single atomic operations consisting of a single destination register and up to three source registers (typical load-operate-store format). Most operations correspond very closely to the original x86 instructions. This design choice lowers complexity at the cost of performance. Bonnell has two identical decoders capable of decoding complex x86 instructions. Because x86 is a variable-length instruction set, decoding carries an additional layer of complexity. To assist the decoders, Bonnell implements predecoders that determine instruction boundaries and mark them using a single-bit marker. Two cycles are allocated for predecoding as well as L1 storage. The boundary marks are also stored in the L1, eliminating the need to perform redundant predecoding. Repeatedly executed instructions are retrieved pre-marked, saving two cycles. Bonnell has a 36 KiB L1 instruction cache consisting of a 32 KiB instruction cache and a 4 KiB instruction boundary mark cache. All instructions, whether they come from the cache or from the predecoders, must undergo full decode. It is worth noting that Intel describes Bonnell as a 16-stage pipeline because, for the most part, an instruction that hits in the cache passes through 16 stages. This also holds in some cases where the processor can simultaneously decode the next instruction. On a miss, however, 3 additional stages are needed to catch up and locate the boundary for that instruction, for a total of 19 stages.
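
The boundary-marking scheme described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not a description of the actual hardware: the instruction lengths are supplied by hand rather than decoded, the marker is assumed to sit on the last byte of each instruction (the text only says a single-bit marker is used), and the function names are hypothetical. The sizes and stage counts are the ones quoted in the text.

```python
BASE_STAGES = 16        # pipeline depth quoted by Intel (from the text)
PREDECODE_PENALTY = 3   # extra stages when boundaries must be located (from the text)


def mark_boundaries(lengths):
    """Return one bit per code byte; 1 marks the last byte of an instruction.

    Assumption: the marker position (last byte) is chosen for illustration.
    """
    marks = []
    for n in lengths:
        marks.extend([0] * (n - 1) + [1])
    return marks


def split_instructions(code, marks):
    """Cut a byte string into instructions using previously stored marks."""
    out, start = [], 0
    for i, bit in enumerate(marks):
        if bit:
            out.append(code[start:i + 1])
            start = i + 1
    return out


def pipeline_stages(marks_cached):
    """Stages seen by an instruction, depending on whether marks were cached."""
    return BASE_STAGES if marks_cached else BASE_STAGES + PREDECODE_PENALTY


def mark_storage_bytes(icache_bytes, bits_per_byte=1):
    """Storage needed for the marks, assuming one mark bit per code byte."""
    return icache_bytes * bits_per_byte // 8


# nop / mov eax, ebx / add eax, 1 -- lengths are given, not decoded
code = bytes.fromhex("9089d883c001")
marks = mark_boundaries([1, 2, 3])
instructions = split_instructions(code, marks)
```

Under the one-bit-per-byte assumption, `mark_storage_bytes(32 * 1024)` gives 4096 bytes, which matches the 32 KiB + 4 KiB split quoted above, and `pipeline_stages(False)` gives the 19-stage figure.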
 
  
Some x86 instructions are simply too complex to handle directly. Those select few are diverted to the '''microcode sequencer ROM''' ('''MSROM''') for decoding, producing much simpler RISC-like operations at the cost of 2 additional cycles. Intel estimates that only 5% of common software requires instructions to be split up. Only decoder0 can request a transfer to the MSROM. Any instruction longer than 8 bytes or with more than three prefixes unconditionally results in an MSROM transfer and incurs the two-cycle delay. The inability to execute instructions [[out-of-order]] eliminates many optimization opportunities at this stage. One thing Bonnell can do is lockstep instructions that can be executed simultaneously, such as instructions that perform a memory access along with an arithmetic operation. In those instances Bonnell issues the instruction as if it were two separate instructions executing simultaneously. In addition, only one [[x87]] instruction can be decoded per cycle.
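
The MSROM transfer conditions stated in this revision can be written as a simple predicate. The following Python sketch is illustrative only: the `Insn` record, its `is_complex` flag and the function names are hypothetical, and the thresholds (8 bytes, three prefixes, 2 cycles) are taken directly from the text rather than from Intel documentation.

```python
from dataclasses import dataclass

MAX_DIRECT_LENGTH = 8      # bytes; longer instructions go to the MSROM (from the text)
MAX_DIRECT_PREFIXES = 3    # more prefixes than this go to the MSROM (from the text)
MSROM_PENALTY_CYCLES = 2   # delay for an MSROM transfer (from the text)


@dataclass(frozen=True)
class Insn:
    """Hypothetical summary of a decoded instruction."""
    length: int               # total encoded length in bytes
    prefixes: int             # number of prefix bytes
    is_complex: bool = False  # hypothetical flag: too complex to handle directly


def needs_msrom(insn):
    """True if the instruction would be diverted to the microcode sequencer."""
    return (
        insn.is_complex
        or insn.length > MAX_DIRECT_LENGTH
        or insn.prefixes > MAX_DIRECT_PREFIXES
    )


def decode_penalty(insn):
    """Extra decode cycles under the rules above."""
    return MSROM_PENALTY_CYCLES if needs_msrom(insn) else 0
```

For example, `decode_penalty(Insn(length=9, prefixes=0))` yields the two-cycle delay, while an instruction of exactly 8 bytes with exactly three prefixes stays on the direct path in this model.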
+
Some x86 instructions are simply too complex to handle directly. Those select few are diverted to the microcode sequencer for decoding, producing much simpler RISC-like operations at the cost of 2 additional cycles. Intel estimates that only 5% of common software requires instructions to be split up. The inability to execute instructions [[out-of-order]] eliminates many optimization opportunities at this stage. One thing Bonnell can do is lockstep instructions that can be executed simultaneously, such as instructions that perform a memory access along with an arithmetic operation. In those instances Bonnell issues the instruction as if it were two separate instructions executing simultaneously.
  
 
Because Bonnell supports {{intel|Hyper-Threading}}, Intel's brand name for its [[simultaneous multithreading]] technology, a number of modifications had to be made. The [[prefetch buffer]] and the [[instruction queue]] are duplicated for each thread.
 
