From WikiChip
Editing intel/microarchitectures/haswell (client)


Latest revision Your text
Line 1: Line 1:
 
{{intel title|Haswell|arch}}
 
{{intel title|Haswell|arch}}
 
{{microarchitecture
 
{{microarchitecture
|atype=CPU
+
| atype           = CPU
|name=Haswell
+
| name             = Haswell
|designer=Intel
+
| designer         = Intel
|manufacturer=Intel
+
| manufacturer     = Intel
|introduction=June 4, 2013
+
| introduction     = June 4, 2013
|phase-out=2015
+
| phase-out       = 2015
|process=22 nm
+
| process         = 22 nm
|cores=2
+
| cores           = 2
|cores 2=4
+
| cores 2         = 4
|cores 3=6
+
| cores 3         = 6
|cores 4=8
+
| cores 4         = 8
|cores 5=10
+
| cores 5         = 16
|cores 6=12
+
| cores 6         =  
|cores 7=14
+
 
|cores 8=16
+
| pipeline        = Yes
|cores 9=18
+
| type            = Superscalar
|type=Superscalar
+
| OoOE            = Yes
|speculative=Yes
+
| speculative     = Yes
|renaming=Yes
+
| renaming         = Yes
|stages min=14
+
| isa              = IA-32
|stages max=19
+
| isa 2            = x86-64
|isa=x86-64
+
| stages min       = 14
|extension=MOVBE
+
| stages max       = 19
|extension 2=MMX
+
| issues          = 4
|extension 3=SSE
+
 
|extension 4=SSE2
+
| inst            = Yes
|extension 5=SSE3
+
| feature          =  
|extension 6=SSSE3
+
| extension       = MOVBE
|extension 7=SSE4.1
+
| extension 2     = MMX
|extension 8=SSE4.2
+
| extension 3     = SSE
|extension 9=POPCNT
+
| extension 4     = SSE2
|extension 10=AVX
+
| extension 5     = SSE3
|extension 11=AVX2
+
| extension 6     = SSSE3
|extension 12=AES
+
| extension 7     = SSE4.1
|extension 13=PCLMUL
+
| extension 8     = SSE4.2
|extension 14=FSGSBASE
+
| extension 9     = POPCNT
|extension 15=RDRND
+
| extension 10     = AVX
|extension 16=FMA3
+
| extension 11     = AVX2
|extension 17=BMI
+
| extension 12     = AES
|extension 18=BMI2
+
| extension 13     = PCLMUL
|extension 19=F16C
+
| extension 14     = FSGSBASE
|extension 20=VT-x
+
| extension 15     = RDRND
|extension 21=VT-d
+
| extension 16     = FMA3
|extension 22=TXT
+
| extension 17     = BMI
|extension 23=TSX
+
| extension 18     = BMI2
|l1i=32 KB
+
| extension 19     = F16C
|l1i per=core
+
| extension 20     = VT-x
|l1i desc=8-way set associative
+
| extension 21     = VT-d
|l1d=32 KB
+
| extension 22     = TXT
|l1d per=core
+
| extension 23     = TSX
|l1d desc=8-way set associative
+
 
|l2=256 KB
+
| cache            = Yes
|l2 per=core
+
| l1i             = 32 KB
|l2 desc=8-way set associative
+
| l1i per         = core
|l3=2 MB
+
| l1i desc         = 8-way set associative
|l3 per=core
+
| l1d             = 32 KB
|l3 desc=1.5 MB/core on Iris Pro GPUs equipped models
+
| l1d per         = core
|l4=128 MB
+
| l1d desc         = 8-way set associative
|l4 per=package
+
| l2               = 256 KB
|l4 desc=on Iris Pro GPUs only
+
| l2 per           = core
|core name=Haswell H
+
| l2 desc         = 8-way set associative
|core name 2=Haswell E
+
| l3               = 1.5 MB
|core name 3=Haswell EP
+
| l3 per           = core
|core name 4=Haswell EX
+
| l3 desc         =  
|core name 5=Haswell DT
+
| l4               = 128 MB
|core name 6=Haswell MB
+
| l4 per           = package
|core name 7=Haswell ULT
+
| l4 desc         = on Iris Pro GPUs only
|core name 8=Haswell ULX
+
 
|predecessor=Ivy Bridge
+
| core names      = Yes
|predecessor link=intel/microarchitectures/ivy bridge
+
| core name       = Haswell H
|successor=Broadwell
+
| core name 2     = Haswell E
|successor link=intel/microarchitectures/broadwell
+
| core name 3     = Haswell EP
|pipeline=Yes
+
| core name 4     = Haswell EX
|OoOE=Yes
+
| core name 5     = Haswell DT
|issues=4
+
| core name 6     = Haswell MB
|inst=Yes
+
| core name 7     = Haswell ULT
|cache=Yes
+
| core name 8     = Haswell ULX
|core names=Yes
+
 
|succession=Yes
+
| succession      = Yes
 +
| predecessor     = Ivy Bridge
 +
| predecessor link = intel/microarchitectures/ivy bridge
 +
| successor       = Broadwell
 +
| successor link   = intel/microarchitectures/broadwell
 
}}
 
}}
 
'''Haswell''' ('''HSW''') is [[Intel]]'s [[microarchitecture]] based on the [[22 nm process]] for mobile devices, desktops, and servers. Introduced in 2013, Haswell succeeded {{\\|Ivy Bridge}}. Haswell is named after [[wikipedia:Haswell, Colorado|Haswell, Colorado]]; it was originally called Molalla, after [[wikipedia:Molalla, Oregon|Molalla, Oregon]], and was later renamed because the name was difficult to pronounce. In 2014 Intel introduced Haswell's successor, {{\\|Broadwell}}.
 
'''Haswell''' ('''HSW''') is [[Intel]]'s [[microarchitecture]] based on the [[22 nm process]] for mobile devices, desktops, and servers. Introduced in 2013, Haswell succeeded {{\\|Ivy Bridge}}. Haswell is named after [[wikipedia:Haswell, Colorado|Haswell, Colorado]]; it was originally called Molalla, after [[wikipedia:Molalla, Oregon|Molalla, Oregon]], and was later renamed because the name was difficult to pronounce. In 2014 Intel introduced Haswell's successor, {{\\|Broadwell}}.
Line 111: Line 115:
  
 
== Architecture ==
 
== Architecture ==
While sharing many similarities with its predecessor {{\\|Ivy Bridge}}, Haswell introduces many enhancements and new features. Haswell is Intel's first desktop x86 line tailored for a [[system on chip]] architecture, a significant move that continued to be developed over the following microarchitectures. Overall, Haswell shares the same basic flow as {{\\|Sandy Bridge}} and {{\\|Ivy Bridge|Ivy}} but expands on them considerably in the execution engine with wider execution units and additional scheduler ports.
+
While sharing many similarities with its predecessor {{\\|Ivy Bridge}}, Haswell introduces many enhancements and new features. Haswell is Intel's first desktop x86 line tailored for a [[system on chip]] architecture, a significant move that continued to be developed over the following microarchitectures. Overall, Haswell shares the same basic flow as {{\\|Sandy Bridge}} and {{\\|Ivy Bridge|Ivy}} but expands on them considerably in the execution engine with wider execution units and additional scheduler ports.
  
 
=== Key changes from {{\\|Ivy Bridge}} ===
 
=== Key changes from {{\\|Ivy Bridge}} ===
[[File:haswell buff window.png|right|350px]]
 
 
* 3.5x performance/watt over {{\\|Nehalem}}
 
* 3.5x performance/watt over {{\\|Nehalem}}
 
* Platform Controller Hub (PCH)
 
* Platform Controller Hub (PCH)
 
** Shrink from [[65 nm]] to [[32 nm]]
 
** Shrink from [[65 nm]] to [[32 nm]]
 
* Support for DDR4 (server/enthusiast segments)
 
* Support for DDR4 (server/enthusiast segments)
* Fully Integrated Voltage Regulator (FIVR)
+
* Integrated voltage regulator (IVR)
 
* New C6 & C7 sleep states
 
* New C6 & C7 sleep states
 
* Cache
 
* Cache
Line 138: Line 141:
  
 
==== CPU changes ====
 
==== CPU changes ====
Haswell can execute many general-purpose instructions at a throughput of 4 ops/cycle. Sandy Bridge and Ivy Bridge could do so only for NOPs, CLC, some vector MOVs, and some zeroing instructions (SUB, XOR, and their vector analogs).
+
Haswell can execute more classes of instructions at a throughput of 4 ops/cycle. Sandy Bridge and Ivy Bridge could do so only for NOPs, CLC, some vector MOVs, and some zeroing instructions (SUB, XOR, and their vector analogs).
 
* MOVSX and MOVZX have 4 op/cycle throughput for 8->32, 8->64 and 16->64 bit forms.
 
* MOVSX and MOVZX have 4 op/cycle throughput for 8->32, 8->64 and 16->64 bit forms.
* Many ALU operations have 4 op/cycle throughput for GP registers: XOR, OR, NEG, NOT, ADD, SUB, CMP, AND, etc.
+
* Some ALU operations have 4 op/cycle throughput for 32-bit registers: XOR, OR, NEG, NOT, although not all (ADD, SUB, CMP and AND don't).
 
* Variable shifts and rotates (SHL r32, CL etc) latency increased from 1 cycle to 2 cycles, variable SHLD/SHRD from 2 cycles to 4 cycles.
 
* Variable shifts and rotates (SHL r32, CL etc) latency increased from 1 cycle to 2 cycles, variable SHLD/SHRD from 2 cycles to 4 cycles.
 
* REP MOVS copy is twice as fast: now ~52 bytes/cycle.
 
* REP MOVS copy is twice as fast: now ~52 bytes/cycle.
Line 152: Line 155:
  
 
====New instructions ====
 
====New instructions ====
 +
{{main|#Added instructions|l1=See #Added_instructions for the complete list}}
 
Haswell introduced a number of new instructions:
 
Haswell introduced a number of new instructions:
 
* {{x86|AVX2|<code>AVX2</code>}} - Advanced Vector Extensions 2; an extension that extends most integer instructions to 256-bit vectors.
 
* {{x86|AVX2|<code>AVX2</code>}} - Advanced Vector Extensions 2; an extension that extends most integer instructions to 256-bit vectors.
 +
** Vector Gather support
 +
** Any-to-Any permutes
 +
** Vector-Vector Shifts
 
* {{x86|BMI1|<code>BMI1</code>}} - Bit Manipulation Instruction Set 1

* {{x86|BMI1|<code>BMI1</code>}} - Bit Manipulation Instruction Set 1

* {{x86|BMI2|<code>BMI2</code>}} - Bit Manipulation Instruction Set 2

* {{x86|BMI2|<code>BMI2</code>}} - Bit Manipulation Instruction Set 2
Line 159: Line 166:
 
* {{x86|FMA3|<code>FMA3</code>}} - Floating Point Multiply Accumulate, 3 operands
 
* {{x86|FMA3|<code>FMA3</code>}} - Floating Point Multiply Accumulate, 3 operands
 
* {{x86|TSX|<code>TSX</code>}} - Transactional Synchronization Extensions
 
* {{x86|TSX|<code>TSX</code>}} - Transactional Synchronization Extensions
* {{x86|INVPCID|<code>INVPCID</code>}} - Invalidate Process-Context Identifier
 
* {{x86|LZCNT|<code>LZCNT</code>}} - [[Leading zero count]]
 
  
 
=== Block Diagram ===
 
=== Block Diagram ===
 +
Due to the success of the front end in {{\\|Ivy Bridge}}, very few changes were made in Haswell.
  
==== Individual Core ====
 
 
[[File:haswell block diagram.svg]]
 
[[File:haswell block diagram.svg]]
  
 
=== Memory Hierarchy ===
 
=== Memory Hierarchy ===
The memory hierarchy in Haswell had a number of changes from its predecessor. The cache bandwidth for both loads and stores has been doubled (64 B/cycle for loads and 32 B/cycle for stores, up from 32 and 16 respectively). Significant enhancements were made to support the new gather instructions and transactional memory. With Haswell's new port 7, which adds an address generation unit for stores, up to two loads and one store are possible each cycle.
+
The memory hierarchy in Haswell had a number of changes from its predecessor. The cache bandwidth for both loads and stores has been doubled (64 B/cycle for loads and 32 B/cycle for stores, up from 32 and 16 respectively). Significant enhancements were made to support the new gather instructions and transactional memory. With Haswell's new port 7, which adds an address generation unit for stores, up to two loads and one store are possible each cycle.
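As a rough illustration of the bandwidth figures above, the following Python sketch converts the per-cycle L1 data cache bandwidth into GB/s. The helper name and the 3.5 GHz clock are assumptions chosen for the example, not values taken from the source.

```python
# Per-cycle L1D bandwidth figures quoted in the text (bytes/cycle).
HASWELL_L1D = {"load": 64, "store": 32}
PREDECESSOR_L1D = {"load": 32, "store": 16}


def bandwidth_gb_per_s(bytes_per_cycle, freq_ghz):
    """Peak bandwidth in GB/s for one core clocked at freq_ghz (hypothetical helper)."""
    return bytes_per_cycle * freq_ghz


if __name__ == "__main__":
    freq_ghz = 3.5  # assumed clock, for illustration only
    for kind in ("load", "store"):
        new = bandwidth_gb_per_s(HASWELL_L1D[kind], freq_ghz)
        old = bandwidth_gb_per_s(PREDECESSOR_L1D[kind], freq_ghz)
        print(f"{kind}: {old:.0f} GB/s -> {new:.0f} GB/s ({new / old:.0f}x)")
```

These are peak numbers per core; sustained bandwidth depends on the access pattern and on the two-loads-plus-one-store port limit described above.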
  
 
* Cache
 
* Cache
Line 226: Line 231:
  
 
==== Front-end ====
 
==== Front-end ====
The front-end is the complicated part of the microarchitecture, as it deals with variable-length x86 instructions ranging from 1 to 15 bytes. The main goal here is to correctly fetch and decode the next set of instructions. The caches have not changed in Haswell from {{\\|Ivy Bridge}}: the [[L1i$]] is still 32 KB, 8-way set associative, and shared dynamically by the two threads. Instruction fetch from the instruction cache remains 16 B/cycle. The [[TLB]] is also still 128 entries, 4-way for 4 KB pages and 8 entries, [[fully associative]] for 2 MB pages. The fetched instructions are then moved to an instruction queue, which has 40 entries, 20 for each thread. Haswell continued to reduce branch misses, although the exact details have not been made public.
+
The front-end is the complicated part of the microarchitecture, as it deals with variable-length x86 instructions ranging from 1 to 15 bytes. The main goal here is to correctly fetch and decode the next set of instructions. The caches have not changed in Haswell from {{\\|Ivy Bridge}}: the [[L1i$]] is still 32 KB, 8-way set associative, and shared dynamically by the two threads. Instruction fetch from the instruction cache remains 16 B/cycle. The [[TLB]] is also still 128 entries, 4-way for 4 KB pages and 8 entries, [[fully associative]] for 2 MB pages. The fetched instructions are then moved to an instruction queue, which has 40 entries, 20 for each thread. Haswell continued to reduce branch misses, although the exact details have not been made public.
  
 
Haswell has the same µOps cache as Ivy Bridge - 1,536 entries organized in 32 sets of 8 cache lines with 6 µOps each. Hits can yield up to 4-µOps/cycle. The cache supports microcoded instructions (being pointers to ROM entries). Cache is shared by the two threads.
 
Haswell has the same µOps cache as Ivy Bridge - 1,536 entries organized in 32 sets of 8 cache lines with 6 µOps each. Hits can yield up to 4-µOps/cycle. The cache supports microcoded instructions (being pointers to ROM entries). Cache is shared by the two threads.
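The µOps cache geometry quoted above can be checked with a few lines of arithmetic. This is a minimal sketch; the constant and function names are hypothetical and only restate the numbers given in the text.

```python
import math

# µOps cache geometry as stated in the text.
SETS = 32
LINES_PER_SET = 8
UOPS_PER_LINE = 6
MAX_UOPS_PER_CYCLE = 4  # peak delivery rate on a hit


def uop_cache_capacity(sets=SETS, lines_per_set=LINES_PER_SET,
                       uops_per_line=UOPS_PER_LINE):
    """Total number of µOp entries the cache can hold."""
    return sets * lines_per_set * uops_per_line


def min_cycles_to_deliver(uops, per_cycle=MAX_UOPS_PER_CYCLE):
    """Lower bound on cycles needed to deliver `uops` µOps at the peak hit rate."""
    return math.ceil(uops / per_cycle)


if __name__ == "__main__":
    print(uop_cache_capacity())                  # total entries
    print(min_cycles_to_deliver(UOPS_PER_LINE))  # cycles for one full line
```

The capacity works out to the 1,536 entries stated above. The cycle count is only a lower bound derived from the 4 µOps/cycle figure, not a description of how the hardware actually schedules line reads.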
Line 233: Line 238:
  
 
==== Execution engine ====
 
==== Execution engine ====
Following the decoder is the [[register renaming]] stage, which is crucial for out-of-order execution. In this stage the architectural x86 registers are mapped onto one of the many physical registers. The integer physical register file (PRF) has been enlarged by 8 additional registers for a total of 168. Likewise, the FP PRF was extended by 24 registers, bringing it to 168 registers as well. The larger increase in the FP PRF is likely to accommodate the new {{x86|AVX2}} extension. The [[reorder buffer|ROB]] in Haswell has been increased to 192 entries (from 168 in Ivy), where each entry corresponds to a single µOp. The ROB is statically split between the two threads. Additional scheduler resources are allocated as well, including store, load, and branch buffer entries. Note that due to how dependencies are handled, there may be more or fewer µOps than what was fed in. For the most part, the renamer is unified and deals with both integers and vectors. Resources, however, are partitioned between the two threads. Finally, as a last step, the µOps are matched with a port depending on their intended execution purpose. Up to 4 fused µOps may be renamed and handled per thread per cycle. The in-flight load and store buffers were increased to 72 and 42 entries respectively.
+
Following the decoder is the [[register renaming]] stage, which is crucial for out-of-order execution. In this stage the architectural x86 registers are mapped onto one of the many physical registers. The integer physical register file (PRF) has been enlarged by 8 additional registers for a total of 168. Likewise, the FP PRF was extended by 24 registers, bringing it to 168 registers as well. The larger increase in the FP PRF is likely to accommodate the new {{x86|AVX2}} extension. The [[reorder buffer|ROB]] in Haswell has been increased to 192 entries (from 168 in Ivy), where each entry corresponds to a single µOp. The ROB is statically split between the two threads. Additional scheduler resources are allocated as well, including store, load, and branch buffer entries. Note that due to how dependencies are handled, there may be more or fewer µOps than what was fed in. For the most part, the renamer is unified and deals with both integers and vectors. Resources, however, are partitioned between the two threads. Finally, as a last step, the µOps are matched with a port depending on their intended execution purpose. Up to 4 fused µOps may be renamed and handled per thread per cycle. The in-flight load and store buffers were increased to 72 and 42 entries respectively.
  
 
Haswell continues to use a unified scheduler for all µOps which holds 60 entries. µOps at this stage sit idle until they are cleared to be  executed via their assigned dispatch port. µOps may be held due to resource unavailability.
 
Haswell continues to use a unified scheduler for all µOps which holds 60 entries. µOps at this stage sit idle until they are cleared to be  executed via their assigned dispatch port. µOps may be held due to resource unavailability.
Line 240: Line 245:
  
 
===== Execution Units =====
 
===== Execution Units =====
Some of the biggest architectural changes were made in the execution units. Haswell widened the scheduler by two ports, one new integer dispatch port and one new memory port, bringing the total to 8 µOps/cycle. The various ports have also been rebalanced. The new port 6 adds another integer ALU to improve integer workloads, freeing up ports 0 and 1 for vector work. It also adds a second branch unit to lower the congestion for port 0. The second port that was added, port 7, adds a new [[address generation unit|AGU]]. This is largely due to the improvements for {{x86|AVX2}} that roughly doubled its throughput. Port 0 had its ALU/Mul/shifter extended to 256 bits; the same is true for the vector ALU on port 1 and the ALU/shuffle on port 5. Additionally, a 256-bit FMA unit was added to both port 0 and port 1. The change makes it possible for FMAs and FMULs to issue on both ports. In theory, Haswell can peak at over double the performance of {{\|Sandy Bridge}}, with 16 double-precision or 32 single-precision [[FLOP]]/cycle, plus an integer ALU operation and a vector operation.
+
Some of the biggest architectural changes were made in the execution units. Haswell widened the scheduler by two ports, one new integer dispatch port and one new memory port, bringing the total to 8 µOps/cycle. The various ports have also been rebalanced. The new port 6 adds another integer ALU to improve integer workloads, freeing up ports 0 and 1 for vector work. It also adds a second branch unit to lower the congestion on port 0. The second port that was added, port 7, adds a new [[address generation unit|AGU]]. This is largely due to the improvements for {{x86|AVX2}} that roughly doubled its throughput. Port 0 had its ALU/Mul/shifter extended to 256 bits; the same is true for the vector ALU on port 1 and the ALU/shuffle on port 5. Additionally, a 256-bit FMA unit was added to both port 0 and port 1. The change makes it possible for FMAs and FMULs to issue on both ports. In theory, Haswell can peak at over double the performance of {{\|Sandy Bridge}}, with 16 double-precision or 32 single-precision [[FLOP]]/cycle, plus an integer ALU operation and a vector operation.
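The 16 double / 32 single precision FLOP/cycle figure follows from the two 256-bit FMA units on ports 0 and 1, counting each fused multiply-add as two floating-point operations per lane. The sketch below works through that arithmetic; the function names are hypothetical, and the core count and clock used in the GFLOPS example are assumptions, not values from the text.

```python
def peak_flops_per_cycle(element_bits, fma_ports=2, vector_bits=256):
    """Peak FLOP/cycle per core when every FMA port issues one FMA each cycle."""
    lanes = vector_bits // element_bits  # elements per vector register
    flops_per_fma_lane = 2               # one multiply + one add
    return fma_ports * lanes * flops_per_fma_lane


def peak_gflops(element_bits, cores, freq_ghz):
    """Theoretical peak GFLOPS for an assumed core count and clock."""
    return peak_flops_per_cycle(element_bits) * cores * freq_ghz


if __name__ == "__main__":
    print("double:", peak_flops_per_cycle(64), "FLOP/cycle")
    print("single:", peak_flops_per_cycle(32), "FLOP/cycle")
    # Assumed configuration for illustration: 4 cores at 3.5 GHz.
    print("DP peak:", peak_gflops(64, cores=4, freq_ghz=3.5), "GFLOPS")
```

This is an upper bound only: it assumes both FMA ports are kept busy every cycle, and it ignores load/store limits and any frequency changes under heavy vector load.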
  
 
The scheduler dispatches up to 8 ready µOps/cycle in [[FIFO]] order through the dispatch ports. µOps involving computational operations are sent to ports 0, 1, 5, and 6 to the appropriate unit. Likewise ports 2, 3, 4 and 7 are used for load/store and address calculations.
 
The scheduler dispatches up to 8 ready µOps/cycle in [[FIFO]] order through the dispatch ports. µOps involving computational operations are sent to ports 0, 1, 5, and 6 to the appropriate unit. Likewise ports 2, 3, 4 and 7 are used for load/store and address calculations.
Line 247: Line 252:
 
{{empty section}}
 
{{empty section}}
 
=== Overclocking ===
 
=== Overclocking ===
{{see also|intel/xmp|l1=Intel's XMP}}
 
 
{{oc warning}}
 
{{oc warning}}
  
Overclocking needs to be done on an unlocked part such as the [[Core i7-5820K]], [[Core i7-5930K]], or [[Core i7-5960X]] Extreme Edition. Additionally those chips need to be paired with the Intel X99 Chipset.
+
Overclocking needs to be done on an unlocked part such as the [[Core i7-5820K]], [[Core i7-5930K]], or [[Core i7-5960X]] Extreme Edition. Additionally those chips need to be paired with the Intel X99 Chipset.
  
 
[[File:haswell oc chips.png|500px|left]]
 
[[File:haswell oc chips.png|500px|left]]
  
The 5930K and the 5820K are [[hexa-core]] parts whereas the [[5960X]] is an octa-core part. Between 28 and 40 [[PCIe]] lanes are possible with a core ratio of up to x80 the [[BCLK]].
+
The 5930K and the 5820K are [[hex-core]] parts whereas the [[5960X]] is an octa-core part. Between 28 and 40 [[PCIe]] lanes are possible with a core ratio of up to x80 the [[BCLK]].
  
 
[[File:haswell bclk.png|300px|right]]
 
[[File:haswell bclk.png|300px|right]]
Haswell provides coarse BCLK ratios of 100 MHz, 125 MHz, or 167 MHz (this was subsequently changed in {{intel|Skylake#Overclocking|Skylake}}). The clock is generated internally by the chipset, but motherboard ODMs could generate it independently. A single BCLK from the PCH is fed in < 1 MHz steps; in practice, however, the input is very much limited by the PCI Express and DMI PLL interface. This works out to 100 MHz ±5-7% PEG/DMI @ 5:5, 125 MHz ±5-7% PEG/DMI @ 5:4, and 166.66 MHz ±5-7% PEG/DMI @ 5:3.
+
Haswell provides coarse BCLK ratios of 100 MHz, 125 MHz, or 167 MHz (this was subsequently changed in {{intel|Skylake#Overclocking|Skylake}}). The clock is generated internally by the chipset, but motherboard ODMs could generate it independently.
 
 
<div style="display: table; padding: 5px;">
 
* '''f<sub>CORE</sub>''' = [[BCLK]] × [Core Ratio]
 
* '''f<sub>RING</sub>''' = BCLK × [Ring Ratio]
 
* '''f<sub>DDR</sub>''' = BCLK × [1.33/1.00] × [DDR Ratio]
 
</div>
 
 
 
All the clock domains in Haswell are derived from the BCLK (also called DMICLK). In the diagram on the right, '''(xC)''' refers to the core frequency and is represented as a multiple of BCLK (Core Frequency = BCLK × Core Freq Multiplier, up to x80). Likewise, '''(xM)''' refers to the memory ratio (up to 2667 MT/s, in granularity steps of 200 and 266 MHz). Two additional multipliers adjust the PEG (PCIe & Graphics)/DMI links, which should remain at a nominal frequency of 100 MHz.
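The clock relations above can be sketched in a few lines of Python. The function names are hypothetical, and reading a strap such as "5:4" as a BCLK:PEG/DMI ratio is an assumption based on the nominal 100 MHz link frequency; the formulas themselves are the ones listed in this section.

```python
MAX_CORE_RATIO = 80  # upper core multiplier quoted in the text


def core_freq_mhz(bclk_mhz, core_ratio):
    """f_CORE = BCLK x core ratio."""
    if core_ratio > MAX_CORE_RATIO:
        raise ValueError("core ratio above x80 is outside the stated range")
    return bclk_mhz * core_ratio


def ring_freq_mhz(bclk_mhz, ring_ratio):
    """f_RING = BCLK x ring ratio."""
    return bclk_mhz * ring_ratio


def ddr_freq(bclk_mhz, ddr_ratio, ref_multiplier=1.33):
    """f_DDR = BCLK x (1.33 or 1.00) x DDR ratio."""
    if ref_multiplier not in (1.00, 1.33):
        raise ValueError("reference multiplier must be 1.00 or 1.33")
    return bclk_mhz * ref_multiplier * ddr_ratio


def peg_dmi_mhz(bclk_mhz, strap):
    """PEG/DMI clock for a strap like '5:4', read here as BCLK:PEG/DMI (assumption)."""
    bclk_part, link_part = (int(x) for x in strap.split(":"))
    return bclk_mhz * link_part / bclk_part


if __name__ == "__main__":
    for bclk, strap in ((100.0, "5:5"), (125.0, "5:4"), (166.66, "5:3")):
        print(f"BCLK {bclk} MHz @ {strap}: PEG/DMI = {peg_dmi_mhz(bclk, strap):.2f} MHz")
    print("core:", core_freq_mhz(125.0, 36), "MHz")  # example ratio, chosen arbitrarily
```

Under this reading, each coarse BCLK setting brings the PEG/DMI links back to roughly 100 MHz, which is why only the three settings with matching ratios are usable in practice.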
 
  
Voltage control is done by Haswell's new FIVR (Fully Integrated Voltage Regulator) based architecture. This means that voltage arrives via the V<sub>CCin</sub> input from the motherboard into the processor and onto the voltage regulator (V<sub>CCin</sub> = [[SVID]] 1.8 V Nom up to 2.3 V+). Internally, the various voltage planes are all derived from there. This includes the V<sub>CORE</sub>, V<sub>RING</sub>, and V<sub>SA</sub>. The memory voltage (V<sub>DDQ</sub> = 1.2 V Nom) is provided from the motherboard on its own rail.
+
All the clock domains in Haswell are derived from the BCLK. In the diagram on the right, '''(xC)''' refers to the core frequency and is represented as a multiple of BCLK (Core Frequency = BCLK × Core Freq Multiplier, up to x80). Likewise, '''(xM)''' refers to the memory ratio (up to 2667 MT/s). Two additional multipliers adjust the PCIe DMI links, which should remain at a nominal frequency of 100 MHz.
  
 
{{clear}}
 
{{clear}}
Line 275: Line 271:
 
Client dies come in [[dual-core|2]]-, [[quad-core|4]]-, or [[octa-core|8]]-core configurations, with dual/quad being mainstream models and the [[octa-core]] being the high-end desktop part.
 
Client dies come in [[dual-core|2]]-, [[quad-core|4]]-, or [[octa-core|8]]-core configurations, with dual/quad being mainstream models and the [[octa-core]] being the high-end desktop part.
  
==== Dual-core GT2 ====
+
====Dual-core ====
* 22 nm process
 
* 960,000,000 transistors
 
* 131 mm² die size
 
* 2 CPU cores
 
  
==== Dual-core GT3 ====
+
: [[File:haswell die (dual-core).jpg|850px]]
* 22 nm process
 
* 1,300,000,000 transistors
 
* 181 mm² die size
 
* 2 CPU cores
 
  
: [[File:haswell gt3 die (dual-core).jpg|850px]]
+
====Quad-core ====
 
 
====Quad-core GT2 ====
 
 
* [[22 nm process]]
 
* [[22 nm process]]
 
* 1,400,000,000 transistors
 
* 1,400,000,000 transistors
Line 300: Line 286:
  
 
: [[File:haswell die (quad-core) (annotated).png|850px]]
 
: [[File:haswell die (quad-core) (annotated).png|850px]]
 
====Quad-core GT3 ====
 
* [[22 nm process]]
 
* 1,700,000,000 transistors
 
* 260 mm² die size
 
* 4 CPU cores
 
  
 
====Octa-core ====
 
====Octa-core ====
Line 319: Line 299:
  
 
:[[File:haswell (octa-core) die shot (annotated).png|650px]]
 
:[[File:haswell (octa-core) die shot (annotated).png|650px]]
 
=== Server Die ===
 
 
====Octadeca-core====
 
* [[18 cores]] processor
 
* [[22 nm process]]
 
* 5,690,000,000 transistors
 
* 622 mm² die size
 
 
:[[File:intel xeon e7 v3.jpg|850px]]
 
  
 
== Added instructions ==
 
== Added instructions ==
Line 659: Line 629:
 
           created and tagged accordingly.
 
           created and tagged accordingly.
  
           Missing a chip? please dump its name here: https://en.wikichip.org/wiki/WikiChip:wanted_chips
+
           Missing a chip? please dump its name here: http://en.wikichip.org/wiki/WikiChip:wanted_chips
 
-->
 
-->
{{comp table start}}
+
<table class="wikitable sortable">
<table class="comptable sortable tc6 tc7 tc20 tc21 tc22 tc23 tc24 tc25">
+
<tr><th colspan="12" style="background:#D6D6FF;">Haswell Chips</th></tr>
<tr class="comptable-header"><th>&nbsp;</th><th colspan="19">List of Haswell Processors</th></tr>
+
<tr><th colspan="9">Main processor</th><th colspan="3">IGP</th></tr>
<tr class="comptable-header"><th>&nbsp;</th><th colspan="9">Main processor</th><th colspan="5">{{intel|Turbo Boost}}</th><th>Mem</th><th colspan="3">IGP</th></tr>
+
<tr><th>Model</th><th>µarch</th><th>Platform</th><th>Core</th><th>Launched</th><th>SDP</th><th>TDP</th><th>Freq</th><th>Max Mem</th><th>Name</th><th>Freq</th><th>Max Freq</th></tr>
{{comp table header 1|cols=Launched, Price, Family, Core Name, Cores, Threads, %L2$, %L3$, TDP, %Frequency, 1 Core, 2 Cores, 3 Cores, 4 Cores, Max Mem, GPU, %Frequency, Turbo}}
+
{{#ask: [[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]]
<tr class="comptable-header comptable-header-sep"><th>&nbsp;</th><th colspan="20">[[Uniprocessors]]</th></tr>
 
{{#ask: [[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]] [[max cpu count::1]]
 
|?full page name
 
|?model number
 
|?first launched
 
|?release price
 
|?microprocessor family
 
|?core name
 
|?core count
 
|?thread count
 
|?l2$ size
 
|?l3$ size
 
|?tdp
 
|?base frequency#GHz
 
|?turbo frequency (1 core)#GHz
 
|?turbo frequency (2 cores)#GHz
 
|?turbo frequency (3 cores)#GHz
 
|?turbo frequency (4 cores)#GHz
 
|?max memory#GiB
 
|?integrated gpu
 
|?integrated gpu base frequency
 
|?integrated gpu max frequency
 
|format=template
 
|template=proc table 3
 
|searchlabel=
 
|sort=microprocessor family, model number
 
|order=asc,asc
 
|userparam=20
 
|mainlabel=-
 
|limit=200
 
}}
 
<tr class="comptable-header comptable-header-sep"><th>&nbsp;</th><th colspan="20">[[Multiprocessors]] (2-way)</th></tr>
 
{{#ask: [[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]] [[max cpu count::2]]
 
|?full page name
 
|?model number
 
|?first launched
 
|?release price
 
|?microprocessor family
 
|?core name
 
|?core count
 
|?thread count
 
|?l2$ size
 
|?l3$ size
 
|?tdp
 
|?base frequency#GHz
 
|?turbo frequency (1 core)#GHz
 
|?turbo frequency (2 cores)#GHz
 
|?turbo frequency (3 cores)#GHz
 
|?turbo frequency (4 cores)#GHz
 
|?max memory#GiB
 
|?integrated gpu
 
|?integrated gpu base frequency
 
|?integrated gpu max frequency
 
|format=template
 
|template=proc table 3
 
|searchlabel=
 
|sort=microprocessor family, model number
 
|order=asc,asc
 
|userparam=20
 
|mainlabel=-
 
|limit=200
 
}}
 
<tr class="comptable-header comptable-header-sep"><th>&nbsp;</th><th colspan="20">[[Multiprocessors]] (4-way)</th></tr>
 
{{#ask: [[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]] [[max cpu count::4]]
 
 
  |?full page name
 
  |?full page name
 
  |?model number
 
  |?model number
  |?first launched
+
  |?microarchitecture
  |?release price
+
  |?platform
|?microprocessor family
 
 
  |?core name
 
  |?core name
|?core count
 
|?thread count
 
|?l2$ size
 
|?l3$ size
 
|?tdp
 
|?base frequency#GHz
 
|?turbo frequency (1 core)#GHz
 
|?turbo frequency (2 cores)#GHz
 
|?turbo frequency (3 cores)#GHz
 
|?turbo frequency (4 cores)#GHz
 
|?max memory#GiB
 
|?integrated gpu
 
|?integrated gpu base frequency
 
|?integrated gpu max frequency
 
|format=template
 
|template=proc table 3
 
|searchlabel=
 
|sort=microprocessor family, model number
 
|order=asc,asc
 
|userparam=20
 
|mainlabel=-
 
|limit=200
 
}}
 
<tr class="comptable-header comptable-header-sep"><th>&nbsp;</th><th colspan="20">[[Multiprocessors]] (8-way)</th></tr>
 
{{#ask: [[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]] [[max cpu count::8]]
 
|?full page name
 
|?model number
 
 
  |?first launched
 
  |?first launched
  |?release price
+
  |?sdp
|?microprocessor family
 
|?core name
 
|?core count
 
|?thread count
 
|?l2$ size
 
|?l3$ size
 
 
  |?tdp
 
  |?tdp
  |?base frequency#GHz
+
  |?base frequency
|?turbo frequency (1 core)#GHz
+
  |?max memory
|?turbo frequency (2 cores)#GHz
 
|?turbo frequency (3 cores)#GHz
 
|?turbo frequency (4 cores)#GHz
 
  |?max memory#GiB
 
 
  |?integrated gpu
 
  |?integrated gpu
 
  |?integrated gpu base frequency
 
  |?integrated gpu base frequency
 
  |?integrated gpu max frequency
 
  |?integrated gpu max frequency
 
  |format=template
 
  |format=template
  |template=proc table 3
+
  |template=proc table 2
|searchlabel=
+
  |userparam=13
|sort=microprocessor family, model number
 
|order=asc,asc
 
  |userparam=20
 
 
  |mainlabel=-
 
  |mainlabel=-
|limit=200
 
 
}}
 
}}
{{comp table count|ask=[[Category:microprocessor models by intel]] [[instance of::microprocessor]] [[microarchitecture::Haswell]]}}
+
<tr><th colspan="12">Count: {{#ask:[[Category:microprocessor models by intel]][[instance of::microprocessor]][[microarchitecture::Haswell]]|format=count}}</th></tr>
 
</table>
 
</table>
{{comp table end}}
 
  
 
== References ==
 
== References ==
 
* Hammarlund, Per, et al. "Haswell: The fourth-generation intel core processor." IEEE Micro 34.2 (2014): 6-20.
 
* Hammarlund, Per, et al. "Haswell: The fourth-generation intel core processor." IEEE Micro 34.2 (2014): 6-20.
* Dan Ragland, Overclocking System Architect, 2015 IDF, in San Francisco, Session RPCS001 ("Overclocking 6th Generation Intel® Core™ Processors!"), August 18, 2015
 
  
 
== Documents ==
 
== Documents ==
 
* [[:File:haswell isa extension.pdf|Haswell new ISA extensions]]
 
* [[:File:haswell isa extension.pdf|Haswell new ISA extensions]]

