From WikiChip
Saltwell - Intel
Revision as of 17:23, 8 April 2016 by At32Hz (talk | contribs) (Memory Hierarchy)


Saltwell was Intel's microarchitecture for 32 nm ultra-low-power systems on chip (SoCs), first introduced in late 2011 for the Atom family. Saltwell is a shrink of Bonnell that also incorporates all support chips on-die. Unlike its predecessor, Saltwell was aimed directly at smartphones (as opposed to MIDs).

Codenames

Platform       Core         Target
Medfield       Penwell      Smartphones
Cedar Trail    Cedarview    Netbooks
Clover Trail+  Cloverview   Tablets
Bordenville    Centerton    Microservers
Bordenville    Briarwood    Microservers
-              Berryville   CE (set-tops)

Architecture

Saltwell's primary goals were:

  1. Improve on Bonnell by getting rid of older support chips
  2. Add enhancements using 32 nm process while transitioning to 22 nm
    1. Improve GPU, power
    2. Burst frequencies

Memory Hierarchy

  • Cache
    • Hardware prefetchers
    • L1 Cache:
      • 32 KB 8-way set associative instruction
        • 1 read and 1 write port
      • 24 KB 6-way set associative data
        • 1 read and 1 write port
      • 8-transistor (8T) SRAM cells (instead of 6T) to allow a lower operating voltage
      • Per core
    • L2 Cache:
      • 512 KB 8-way set associative
      • ECC
      • Shrinkable from 512 KB to 128 KB (2-way)
      • 32B/cycle and 32 outstanding cache requests
      • separate voltage rail, fixed at 1.05 V
      • Per core
    • L3 Cache:
      • No level 3 cache
    • Non-Cache Shared State Memory
      • 256 KB low-power SRAM
      • separate voltage plane
      • always-on block that stores architectural states while in various power saving modes
    • RAM
      • Maximum of 1 GB, 2 GB, or 4 GB, depending on the model
      • dual 32-bit channels, 1 or 2 ranks per channel
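
The cache figures above determine how many sets each cache has once a line size is fixed. The sketch below works that out; the 64-byte line size is an assumption (the list does not state it), and `num_sets` is a helper name introduced here for illustration only.

```python
# Set counts for the caches listed above.
# ASSUMPTION: 64-byte cache lines (the line size is not stated in the text).
LINE_BYTES = 64


def num_sets(size_kb: int, ways: int, line_bytes: int = LINE_BYTES) -> int:
    """Return the number of sets in a set-associative cache."""
    size_bytes = size_kb * 1024
    if size_bytes % (ways * line_bytes) != 0:
        raise ValueError("cache size must be a multiple of ways * line size")
    return size_bytes // (ways * line_bytes)


# (size in KB, associativity), taken from the list above
caches = {
    "L1 instruction": (32, 8),
    "L1 data": (24, 6),
    "L2 (full)": (512, 8),
    "L2 (shrunk)": (128, 2),
}

for name, (size_kb, ways) in caches.items():
    print(f"{name}: {num_sets(size_kb, ways)} sets")
```

Whatever the real line size is, the full and shrunk L2 configurations come out with the same number of sets (64 KB per way in both cases). That would be consistent with shrinking the cache by disabling ways, although the list does not say how the shrink is implemented.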

Functional Units

The number of functional units was kept to a minimum to cut power consumption.

  • 2 Integer ALUs (1 for jumps, 1 for shifts)
  • 2 FP ALUs (1 adder, 1 for others)
  • No Integer multiplier & divider

Pipeline

Saltwell's pipeline is almost identical to Bonnell's: 16 stages, with a 13-stage branch misprediction penalty. It is still a dual-issue superscalar design with in-order execution; reordering logic was again omitted due to power and area restrictions.

[Figure: Bonnell pipeline diagram (bonnell pipeline.svg)]

The longer pipeline spreads heat more evenly across the chip by distributing work over more units. It also allows a higher clock rate.

  • Instruction Fetch
    • 3 stages
    • 48 Bytes/Cycle (lower if SMT)
  • Instruction Decode
    • 3 stages
    • Instructions with up to 3 prefixes/Cycle
  • Instruction Dispatch
    • 2 stages
  • Source Operand Read
  • Data Cache Access
    • 3 stages
      • 1 stage for address calculation
      • 2 stages for reading cache
  • Execution
    • 2 clusters
      • integers
        • quick cache access due to direct connection
      • floating point & SIMD
  • Exception & MT Handling
    • 2 stages
  • Commit
    • 1 stage
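
As a quick sanity check, the per-phase stage counts above can be added up and compared with the 16-stage figure. The list gives no count for Source Operand Read or Execution; the sketch assumes one stage each, which is what makes the total come out to 16. The names `PIPELINE`, `total_stages`, and `stages_through` are introduced here for illustration.

```python
# Stage counts from the list above. Entries marked ASSUMED are not stated
# in the text; one stage each makes the total match the 16-stage figure.
PIPELINE = [
    ("Instruction Fetch", 3),
    ("Instruction Decode", 3),
    ("Instruction Dispatch", 2),
    ("Source Operand Read", 1),      # ASSUMED
    ("Data Cache Access", 3),
    ("Execution", 1),                # ASSUMED
    ("Exception & MT Handling", 2),
    ("Commit", 1),
]


def total_stages(pipeline):
    """Sum the stage counts of every phase."""
    return sum(count for _, count in pipeline)


def stages_through(pipeline, last_phase):
    """Cumulative stage count from the front end up to and including last_phase."""
    running = 0
    for name, count in pipeline:
        running += count
        if name == last_phase:
            return running
    raise KeyError(f"unknown phase: {last_phase}")


print(total_stages(PIPELINE))
print(stages_through(PIPELINE, "Execution"))
```

Under these assumptions the cumulative count through Execution equals the 13-stage misprediction penalty quoted in this article. That would fit branches being resolved at the end of execution, but the article does not state this.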

Multithreading

Saltwell supports multithreading, with up to two threads per core. However, the threads compete for the same resources, which inherently means each runs slower than it would if it were running alone.

Branch Prediction

  • Two-level adaptive predictor
  • 12-bit branch history register
  • Pattern history table has 8192 entries (shared between threads), twice that of Bonnell
  • Branch target buffer (BTB) has 128 entries (4-way, 32 sets)
  • Unconditional jumps are ignored
  • Always-taken and never-taken are marked in the table
  • Penalties:
    • 13 stages for a misprediction
    • 7 stages for a correct prediction that misses the branch target buffer (BTB)
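
To illustrate how a two-level adaptive predictor of this size might behave, here is a minimal toy model using the 12-bit history register and 8192-entry pattern history table from the list above. Several details are assumptions, not documented behaviour: the 2-bit saturating counters, their initial value, and the index function (one program-counter bit concatenated with the history to form a 13-bit index). `TwoLevelPredictor` and `accuracy` are hypothetical names.

```python
class TwoLevelPredictor:
    """Toy two-level adaptive predictor sized per the figures above."""

    HISTORY_BITS = 12    # branch history register width (from the text)
    PHT_ENTRIES = 8192   # pattern history table entries (from the text)

    def __init__(self) -> None:
        self.history = 0
        # ASSUMPTION: 2-bit saturating counters, initialised weakly not-taken.
        self.pht = [1] * self.PHT_ENTRIES

    def _index(self, pc: int) -> int:
        # ASSUMPTION: 8192 entries need a 13-bit index, so one PC bit is
        # concatenated with the 12 history bits. The real hash is not given.
        return ((pc & 1) << self.HISTORY_BITS) | self.history

    def predict(self, pc: int) -> bool:
        """Predict taken when the selected counter is in its upper half."""
        return self.pht[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        """Train the selected counter, then shift the outcome into the history."""
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        mask = (1 << self.HISTORY_BITS) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def accuracy(pattern, warmup=100, trials=100, pc=0x400):
    """Train on a repeating outcome pattern, then measure the hit rate."""
    predictor = TwoLevelPredictor()
    hits = 0
    for n in range(warmup + trials):
        taken = pattern[n % len(pattern)]
        if n >= warmup and predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / trials


# BTB geometry from the list above: 4 ways x 32 sets = 128 entries.
BTB_WAYS, BTB_SETS = 4, 32
assert BTB_WAYS * BTB_SETS == 128

print(accuracy([True, False]))        # alternating branch
print(accuracy([True, True, False]))  # taken, taken, not-taken loop
```

The point of the history register is that a short repeating pattern maps each recent-outcome history to a single next outcome, so the counters can learn it after a brief warm-up. A single per-branch counter could not track the alternating case.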

Cores

  • Penwell - SoCs specifically for smartphones
  • Cedarview - SoCs for netbooks
  • Cloverview - SoCs for tablets
  • Centerton - SoCs for Microservers; added support for Intel VT and ECC memory
  • Briarwood - SoCs for Microservers
  • Berryville - SoCs for consumer electronics (e.g. set-tops)