Difference between revisions of "intel/microarchitectures/broadwell (client)"

Line 62:
  | l3 per        = core
  | l3 desc       =
+ | l4            = 128 MB
+ | l4 per        = package
+ | l4 desc       = on Iris Pro GPUs only
  | core names    = Yes

Revision as of 00:26, 13 April 2016

Broadwell µarch
General Info

Broadwell (BDW) is Intel's microarchitecture based on the 14 nm process for mobile, desktop, and server processors. Introduced in early 2015, Broadwell is a process shrink of Haswell that also introduced several enhancements.

Codenames

Core                     Target
Broadwell Y (BDW-Y)      Core M family; SoC for smartphones, 2-in-1s, tablets, and notebooks
Broadwell U (BDW-U)      Core ultrabooks
Broadwell H (BDW-H)      IoT (QM87, HM86/HM87 chipsets), all-in-ones
Broadwell DT (BDW-DT)    Unlocked desktop MPUs
Broadwell EP (BDW-EP)    Xeon E5, dual-processor platform
Broadwell EX (BDW-EX)    Xeon E7, multi-processor platform, QPI
Broadwell E (BDW-E)      High-end desktops (HEDT)

Architecture

Broadwell is for the most part identical to Haswell, with a number of enhancements aimed at servers.

Key changes from Haswell

  • ~5% IPC improvement
  • FP multiplication instructions have reduced latency (3 cycles, down from 5)
    • Affects AVX, SSE, and FP instructions
  • CLMUL instructions are now a single μop, improving latency and throughput
  • Second-level TLB (STLB):
    • Table was enlarged (1,536 entries, up from 1,024)
    • 1 GB page mode added (16 entries, 4-way set associative)
  • Larger out-of-order scheduler
  • Faster store-to-load forwarding
  • Address prediction for branches and returns was improved
  • Improved cryptography acceleration instructions
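
To put two of the figures in the list above into perspective, here is a minimal back-of-the-envelope sketch. It assumes that every STLB entry maps a single 4 KiB page (an assumption; the list gives only entry counts) and that a chain of multiplications is fully dependent, so each one waits for the previous result. The helper names `tlb_reach_bytes` and `dependent_chain_cycles` are hypothetical and used only for illustration.

```python
# Rough arithmetic only; not a model of the real hardware.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_4K = 4 * KIB  # assumption: each STLB entry maps one 4 KiB page


def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Memory covered when every TLB entry maps one page of page_size bytes."""
    return entries * page_size


def dependent_chain_cycles(n_ops: int, latency: int) -> int:
    """Cycles for n_ops operations where each one needs the previous result."""
    return n_ops * latency


haswell_reach = tlb_reach_bytes(1024, PAGE_4K)    # 1,024 entries (Haswell)
broadwell_reach = tlb_reach_bytes(1536, PAGE_4K)  # 1,536 entries (Broadwell)
gb_page_reach = tlb_reach_bytes(16, GIB)          # 16 entries in 1 GB page mode

haswell_chain = dependent_chain_cycles(1000, 5)    # FP multiply latency 5
broadwell_chain = dependent_chain_cycles(1000, 3)  # FP multiply latency 3

print(f"STLB reach with 4 KiB pages: {haswell_reach // MIB} MiB -> {broadwell_reach // MIB} MiB")
print(f"Reach of the 1 GB page entries: {gb_page_reach // GIB} GiB")
print(f"1000 dependent FP multiplies: {haswell_chain} -> {broadwell_chain} cycles")
```

Under these assumptions the larger table raises 4 KiB-page reach from 4 MiB to 6 MiB, and a latency-bound multiply chain finishes in 60% of the cycles. Code that is limited by throughput rather than latency would see a smaller benefit.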

Features added to the core maintained a 2:1 ratio of performance gain to power cost.

Graphics

  • 50% higher sampler throughput
  • Increased geometry, Z, and pixel fill throughput
  • DirectX 11.2, OpenGL 4.3
  • OpenCL 1.2 and 2.0 (with Shared Virtual Memory)
  • Up to 24 EUs (a 20% increase, up from 20 in Haswell); 48 EUs on Iris Pro Graphics
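
The EU counts above can be checked with a few lines of arithmetic. This is only a sketch: it assumes that peak shader throughput scales linearly with EU count at equal clock speed and equal per-EU throughput, which the list does not state, and the function names are hypothetical.

```python
def eu_increase_percent(old_eus: int, new_eus: int) -> float:
    """Percentage increase in execution units (EUs) from old_eus to new_eus."""
    return (new_eus - old_eus) * 100 / old_eus


def relative_peak(eus: int, baseline_eus: int) -> float:
    """Peak throughput relative to a baseline, assuming linear scaling with EU count."""
    return eus / baseline_eus


gain = eu_increase_percent(20, 24)    # Haswell 20 EUs -> Broadwell 24 EUs
iris_pro_vs_24 = relative_peak(48, 24)  # Iris Pro (48 EUs) vs. the 24-EU part

print(f"EU increase over Haswell: {gain:.0f}%")
print(f"Iris Pro vs. 24-EU configuration: {iris_pro_vs_24:.1f}x")
```

Real-world scaling is usually below these ratios, since memory bandwidth and clock speed also limit graphics performance.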

Block Diagram

haswell block diagram.svg

Memory Hierarchy

  • Cache
    • Hardware prefetchers
    • L1 Cache:
      • 32 KB 8-way set associative instruction, 64 B line size
      • 32 KB 8-way set associative data, 64 B line size
      • Write-back policy
      • Per core
    • L2 Cache:
      • 256 KB 8-way set associative, 64 B line size
      • Write-back policy
      • Per core
    • L3 Cache:
      • 1.5 MB
    • L4 Cache: