From WikiChip
Difference between revisions of "intel/microarchitectures/broadwell (client)"
Revision as of 20:56, 13 April 2016
Broadwell (BDW) is Intel's microarchitecture based on the 14 nm process for mobile, desktop, and server processors. Introduced in early 2015, Broadwell is a process shrink of Haswell that also introduces several enhancements. Broadwell is named after Broadwell, Illinois.
Codenames
Core | Target |
---|---|
Broadwell Y (BDW-Y) | Core M family, SoC for smartphones, 2-in-1s, tablets, and notebooks |
Broadwell U (BDW-U) | Core ultrabooks |
Broadwell H (BDW-H) | IoT (QM87, HM86/HM87 Chipsets), All-in-ones |
Broadwell DT (BDW-DT) | Unlocked desktop MPUs |
Broadwell EP (BDW-EP) | Xeon E5, Dual-Processor platform |
Broadwell EX (BDW-EX) | Xeon E7, Multi-Processor platform, QPI |
Broadwell E (BDW-E) | High-End Desktops (HEDT) |
Architecture
Broadwell is largely identical to Haswell, with several enhancements.
Key changes from Haswell
- ~5% IPC improvement
- FP multiplication instructions have reduced latency (3 cycles, down from 5)
  - Affects AVX, SSE, and FP instructions
- CLMUL instructions are now a single μop, improving latency and throughput
- The second-level TLB (STLB)
  - Table was enlarged (1,536 entries, up from 1,024)
  - 1 GB page mode (16 entries, 4-way set associative)
- Larger out-of-order scheduler
- Faster store-to-load forwarding
- Address prediction for branches and returns was improved
- Improved cryptography acceleration instructions
Each added core feature maintained a 2:1 ratio of performance gain to power cost.
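As a rough illustration of two of the figures listed above, the following sketch works out the STLB reach (entries times page size) and the latency-bound cycle count of a chain of dependent FP multiplies. The 4 KiB page size, the chain length, and the function names are assumptions made for this example and do not come from the article; the model ignores everything except the entry counts and latencies quoted above.

```python
# Back-of-the-envelope arithmetic for two Haswell -> Broadwell changes.
# Assumptions (not from the article): 4 KiB pages, and a multiply chain in
# which every operation depends on the result of the previous one.

PAGE_SIZE = 4 * 1024  # bytes; assumed 4 KiB pages


def stlb_reach(entries, page_size=PAGE_SIZE):
    """Bytes of address space covered if every STLB entry maps one page."""
    return entries * page_size


def dependent_chain_cycles(n_ops, latency):
    """Lower bound on cycles for n_ops multiplies executed back to back,
    where each multiply must wait for the previous result."""
    return n_ops * latency


haswell_reach = stlb_reach(1024)    # 1,024 entries
broadwell_reach = stlb_reach(1536)  # 1,536 entries

print(f"STLB reach: {haswell_reach // 2**20} MiB -> {broadwell_reach // 2**20} MiB")

chain = 1000  # hypothetical number of dependent multiplies
print("Dependent FP multiply chain:",
      dependent_chain_cycles(chain, 5), "->",
      dependent_chain_cycles(chain, 3), "cycles")
```

Under these assumptions the larger table covers 6 MiB instead of 4 MiB with 4 KiB pages, and a purely latency-bound multiply chain needs three fifths of the cycles. Real code is rarely this latency-bound, so the overall gain is far smaller.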
Graphics
- 50% higher sampler throughput
- Improved geometry, Z, and pixel fill throughput
- DirectX 11.2, OpenGL 4.3
- OpenCL 1.2 and 2.0 (with Shared Virtual Memory)
- Up to 24 EUs (a 20% increase, up from 20 in Haswell), 48 EUs on Iris Pro Graphics
Block Diagram
Memory Hierarchy
- Cache
  - Hardware prefetchers
  - L1 Cache:
    - 32 KB 8-way set associative instruction, 64 B line size
    - 32 KB 8-way set associative data, 64 B line size
    - Write-back policy
    - Per core
  - L2 Cache:
    - 256 KB 8-way set associative, 64 B line size
    - Write-back policy
    - Per core
  - L3 Cache:
    - 1.5 MB
  - L4 Cache:
    - 128 MB
    - eDRAM
    - shared with GPU (Crystal Well)
    - Iris Pro models only
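
The L1 and L2 parameters listed above determine how many sets each cache has and how an address splits into offset and index bits. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it assumes power-of-two sizes and a conventional offset/index/tag address split, which the article does not spell out.

```python
# Derive cache geometry from capacity, associativity, and line size.
# Assumption (not from the article): sizes are powers of two and addresses
# are split in the conventional offset / index / tag order.

def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the set count
    return sets, offset_bits, index_bits


l1 = cache_geometry(32 * 1024, ways=8, line_bytes=64)
l2 = cache_geometry(256 * 1024, ways=8, line_bytes=64)

for name, (sets, offset_bits, index_bits) in (("L1", l1), ("L2", l2)):
    print(f"{name}: {sets} sets, {offset_bits} offset bits, {index_bits} index bits")
```

With the listed figures this gives 64 sets for each 32 KB L1 cache and 512 sets for the 256 KB L2 cache, both with a 6-bit line offset.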