From WikiChip
Editing amd/microarchitectures/k6

== Architecture ==
* 6-7 stage integer pipeline
** Fetch
** Decode
** Micro-op issue
** Operand Fetch
** Execute 1
** Optional Execute 2
** Commit
* Branch Predictor
** 2-level predictor with 8192 entry branch history table
*** Does not store target addresses. Target addresses are calculated during instruction decode
** 16-entry branch target cache
*** Caches 16 bytes of instructions at the branch target, and supplies that directly to the decoders to avoid a 1-cycle penalty
** 16-entry return address stack
* 32K 2-way L1 Instruction Cache
** Also holds 20K of predecode data. Instructions are predecoded as L1 instruction cache is filled
** 64 entry TLB
* Fetch and Decode
** 16 instruction bytes fetched per cycle, either from L1 instruction cache or branch target cache
** Decoders can handle the following combinations per clock:
*** Short decode: Two x86 instructions that generate up to two micro-ops each
*** Vector decode: One x86 instruction that generates up to four micro-ops
*** Complex instructions handled by microcode
* Out of order execution resources
** Scheduler holds 6 groups of 4 micro-ops, or up to 12 x86 instructions
*** Receives a group of 4 micro-ops from decoders every cycle. If fewer than 4 micro-ops are generated from the decoders, empty slots are padded with NOPs
*** Retires one 4-micro-op group every cycle
** 48 integer registers: 8 architectural, 16 scratch, 24 rename
** 21 MMX registers: 8 architectural, 1 scratch, 12 rename
** Up to 7 outstanding branches
** 7 entry store queue
* Issues up to 6 micro-ops per cycle. Ports:
** Integer X: All ALU ops, including multiplies and divides
** Integer Y: Basic ALU ops
*** Both integer pipelines can handle MMX/3DNow instructions, but share MMX/3DNow multiplier and MMX shifter functional units. If the two pipelines try to issue operations to a shared functional unit, one operation will stall.
** Floating point
** Load
** Store
** Branch
* 32K 2-way L1 Data Cache
** Dual ported, write back
** 128 entry TLB
** Load unit has 2-cycle latency
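
The group-based scheduling described in the list above can be illustrated with a small model. This is only a sketch under stated assumptions: the function names and the string representation of micro-ops are hypothetical, and the only figures taken from the list are the group size (4 micro-ops), the scheduler depth (6 groups), and the NOP padding rule.

```python
GROUP_SIZE = 4        # micro-ops per decoder group (from the list above)
SCHEDULER_GROUPS = 6  # groups held by the scheduler (from the list above)


def pad_group(micro_ops):
    """Pad one cycle's decoder output to a full 4-slot group with NOPs."""
    if len(micro_ops) > GROUP_SIZE:
        raise ValueError("decoders emit at most 4 micro-ops per cycle")
    return list(micro_ops) + ["NOP"] * (GROUP_SIZE - len(micro_ops))


def scheduler_capacity():
    """Total micro-op slots in the scheduler."""
    return SCHEDULER_GROUPS * GROUP_SIZE


def slot_utilization(groups):
    """Fraction of slots holding real micro-ops rather than NOP padding."""
    total = len(groups) * GROUP_SIZE
    useful = sum(op != "NOP" for group in groups for op in group)
    return useful / total if total else 0.0


# Two short-decoded instructions that each produce a single micro-op
# fill only half of their group; the rest is padding.
half_full = pad_group(["load", "add"])
full = pad_group(["load", "add", "store", "cmp"])
```

In this model the scheduler has 24 micro-op slots. Because a padded group still occupies a full scheduler entry, code that decodes into few micro-ops per cycle uses the scheduler less densely, which `slot_utilization` makes visible.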
 
[[File:K6 block diagram.png]]
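
The shared-unit conflict between the Integer X and Integer Y pipelines can be sketched as a one-cycle issue check. The unit names and the function below are hypothetical, and the choice of which operation stalls is an assumption made for illustration (here the Y operation); the list above only states that one of the two operations stalls.

```python
# Functional units shared by the Integer X and Integer Y pipelines,
# per the port list above. The identifiers are illustrative names.
SHARED_UNITS = {"mmx_multiplier", "mmx_shifter"}


def issue_integer_pair(x_unit, y_unit):
    """Return (x_issues, y_issues) for one cycle.

    x_unit / y_unit name the functional unit each operation needs,
    or None when that pipeline has nothing to issue.
    Assumption: on a conflict the Y operation is the one that stalls.
    """
    conflict = (
        x_unit is not None
        and x_unit == y_unit
        and x_unit in SHARED_UNITS
    )
    x_issues = x_unit is not None
    y_issues = y_unit is not None and not conflict
    return (x_issues, y_issues)
```

Under this model, two MMX multiplies in the same cycle serialize, while an MMX multiply paired with an MMX shift, or two basic ALU operations, can issue together.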
 
  
 
== Die Shot ==
 
* {{amd|K6}}
* {{intel|P6}}
== References ==
* [http://www.amd-k6.com/wp-content/uploads/2012/07/AMD_K6-2_Desktop_Datasheet.pdf AMD K6-2 Data Sheet]
* [http://www.ii.uib.no/~osvik/amd_opt/21924d.pdf AMD K6 Code Optimization]
 

* codename: K6
* designer: AMD and NexGen
* first launched: April 2, 1997
* full page name: amd/microarchitectures/k6
* instance of: microarchitecture
* instruction set architecture: x86-32
* manufacturer: AMD
* microarchitecture type: CPU
* name: K6
* phase-out: 2000
* process: 350 nm (0.35 μm, 3.5e-4 mm) and 250 nm (0.25 μm, 2.5e-4 mm)