{{intel title|Goldmont|arch}}
{{microarchitecture
| name          = Goldmont
| designer      = Intel
| manufacturer  = Intel
| introduction  = August 30, 2016
| phase-out     =
| process       = 14 nm
| cores         = 2
| cores 2       = 4
| cores 3       = 8

| pipeline      = Yes
| type          = Superscalar
| OoOE          = Yes
| speculative   = Yes
| renaming      = Yes
| isa           = IA-32
| isa 2         = x86-64
| stages min    = 12
| stages max    = 14
| issues        = 3

| inst          = Yes
| feature       = 
| extension     = MOVBE
| extension 2   = MMX
| extension 3   = SSE
| extension 4   = SSE2
| extension 5   = SSE3
| extension 6   = SSSE3
| extension 7   = SSE4.1
| extension 8   = SSE4.2
| extension 9   = POPCNT
| extension 10  = AES
| extension 11  = PCLMUL
| extension 12  = RDRND

| cache         = Yes
| l1i           = 32 KiB
| l1i per       = Core
| l1i desc      = 8-way set associative
| l1d           = 24 KiB
| l1d per       = Core
| l1d desc      = 6-way set associative
| l2            = 1 MiB
| l2 per        = 2 Cores
| l2 desc       = 16-way set associative

| core names       = Yes
| core name        = Apollo Lake
| core name 2      =
| core name N      =

| succession       = Yes
| predecessor      = Airmont
| predecessor link = intel/microarchitectures/airmont
| successor        =
| successor link   =
}}
'''Goldmont''' is [[Intel]]'s [[14 nm]] [[microarchitecture]] for [[system on chip]]s aimed at ultra-low power (ULP) devices. Goldmont-based processors and SoCs are part of the {{intel|Atom}}, {{intel|Pentium (2009)|Pentium}}, and {{intel|Celeron}} families. Goldmont superseded {{intel|Airmont}} in August 2016. With Goldmont, Intel stopped targeting smartphones altogether, cancelling the related cores and SKUs.
[[File:Atom E3900 SoC Front.png|right|thumb|250px|Intel Atom E3900 SoC series]]

== Codenames ==
{| class="wikitable"
|-
! Platform !! Core !! Target
|-
| {{intel|Apollo Lake}} || {{intel|Apollo Lake}} || Tablets, entry-level PCs
|- style="text-decoration: line-through;"
| {{intel|Willow Trail}} || {{intel|Willow Trail}} || Lightweight tablets and high-end smartphones
|- style="text-decoration: line-through;"
| {{intel|Morganfield}} || {{intel|Broxton}} || Smartphones
|}

== Process Technology ==
{{main|intel/microarchitectures/broadwell#Process_Technology|l1=Broadwell § Process Technology}}
Goldmont-based chips are manufactured on Intel's [[14 nm process]].

== Architecture ==

=== Key changes from {{intel|Airmont}} ===
* Pipeline
** Goldmont is a 3-issue core, up from the 2-issue design of Airmont.
** Throughput
*** NOPs, MOVs and many ALU operations have 3 op/cycle throughput for 16-, 32- and 64-bit registers (8-bit ALU op throughput is 2, 1.5 or 1 op/cycle).
*** ADC, SBB have 0.5 op/cycle throughput, unchanged from Airmont.
*** INC, DEC, BTx, and shift ops are no faster than on Airmont; 8-bit shifts are slightly slower (0.66 op/cycle instead of 1).
*** Rotate-through-carry (RCL, RCR) remains slow and is slightly slower than on Airmont (~12 cycles per op, up from ~10).
*** 16- and 64-bit shift-double (SHLD, SHRD) remains slow and is slightly slower than on Airmont (~14 cycles per op, up from ~10). 32-bit SHLD and SHRD are fast (2-4 cycles).
*** Latency of variable shifts and rotates (SHL r32, CL, etc.) increased from 1 cycle to 2 cycles.
*** Bit scan (BSF, BSR) throughput improved from 10 to 8 cycles per op.
*** MUL throughput is better by 1 cycle (used to be 5/7 cycles for 32/64-bit mul, now 4/6).
*** DIV is more than twice as fast as on Airmont: 13 cycles for most divides, ~42 cycles for 128-bit/64-bit divides.
*** PUSH to POP forwarding is improved.
*** REP MOVS streaming copy is twice as fast: now ~26 bytes/cycle.
*** REP STOS fill is not improved: ~9 bytes/cycle.
*** Some vector instructions are faster, but as on {{\\|Airmont}}, none exceed 2 op/cycle throughput. The faster instructions include commonly used ops such as adds and multiplies:
**** MULPS and MULPD have 4 cycle latency and 1 op/cycle throughput (previously 5 cycle latency and 0.5 op/cycle throughput).
**** ADDPD has 3 cycle latency and 1 op/cycle throughput (previously 4 cycle latency and 0.5 op/cycle throughput).
*** CRC32 instruction throughput improved from 6 cycles/op to 1 cycle/op, and latency is halved from 6 cycles to 3.
* Gen 9 GPUs
** {{intel|HD Graphics 400}} '''→''' {{intel|HD Graphics 500}} (12 Execution Units, no change)
** {{intel|HD Graphics 405}} '''→''' {{intel|HD Graphics 505}} (18 Execution Units, up from 16)
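
The cycle counts quoted in the throughput list above can be turned into relative speedups with simple arithmetic. The following Python sketch is illustrative only: the figures are copied from that list (several are approximate), and the helper names <code>speedup</code>, <code>ops_per_cycle</code>, and <code>copy_cycles</code> are hypothetical, not part of any Intel tool.

```python
# Figures copied from the list above as (Airmont, Goldmont) cycles per op.
# Several of them are approximate ("~") in the source, so treat results as rough.
CYCLES_PER_OP = {
    "RCL/RCR": (10, 12),
    "SHLD/SHRD (16/64-bit)": (10, 14),
    "BSF/BSR": (10, 8),
    "MUL (32-bit)": (5, 4),
    "MUL (64-bit)": (7, 6),
    "CRC32": (6, 1),
}


def speedup(old_cycles, new_cycles):
    """Return how many times faster the new figure is (below 1.0 means slower)."""
    return old_cycles / new_cycles


def ops_per_cycle(cycles_per_op):
    """Convert reciprocal throughput (cycles/op) into throughput (op/cycle)."""
    return 1.0 / cycles_per_op


def copy_cycles(num_bytes, bytes_per_cycle):
    """Estimate cycles for a steady-state streaming copy or fill.

    Assumes the quoted bytes/cycle rate is sustained for the whole
    transfer, which ignores startup overhead and cache misses.
    """
    return num_bytes / bytes_per_cycle


if __name__ == "__main__":
    for name, (old, new) in CYCLES_PER_OP.items():
        print(f"{name}: {speedup(old, new):.2f}x relative to Airmont")
    # 4 KiB REP MOVS copy at the ~26 bytes/cycle quoted above.
    print(f"4 KiB copy: ~{copy_cycles(4096, 26):.0f} cycles")
```

For example, <code>speedup(6, 1)</code> for CRC32 gives 6.0, while <code>speedup(10, 12)</code> for RCL/RCR gives a value below 1, reflecting the regression noted in the list.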

=== Block Diagram ===
{{empty section}}

=== Memory Hierarchy ===
* Cache
** Hardware prefetchers
** L1 Cache:
*** 32 [[KiB]] 8-way [[set associative]] instruction, 64 B line size
*** 24 KiB 6-way set associative data, 64 B line size
*** Per core
** L2 Cache:
*** 1 MiB 16-way set associative, 64 B line size
*** Per 2 cores
** L3 Cache:
*** No level 3 cache
* RAM
** Maximum of 1 [[GiB]], 2 GiB, 4 GiB, or 8 GiB
** Dual 32-bit channels, 1 or 2 ranks per channel
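
The cache geometries listed above fix the number of sets in each cache, since sets = size / (ways × line size). The Python sketch below works this out for the three caches. The names <code>num_sets</code> and <code>set_index</code> are hypothetical, and the plain modulo indexing in <code>set_index</code> is an assumption for illustration; the actual index function used by Goldmont is not stated here.

```python
KIB = 1024
MIB = 1024 * KIB
LINE_SIZE = 64  # bytes, as listed for every cache level above

# (size in bytes, associativity) taken from the list above.
CACHES = {
    "L1I": (32 * KIB, 8),
    "L1D": (24 * KIB, 6),
    "L2": (1 * MIB, 16),
}


def num_sets(size_bytes, ways, line_size=LINE_SIZE):
    """Return the number of sets in a set-associative cache."""
    lines, remainder = divmod(size_bytes, line_size)
    if remainder or lines % ways:
        raise ValueError("size must be a whole multiple of ways * line_size")
    return lines // ways


def set_index(address, sets, line_size=LINE_SIZE):
    """Map an address to a set, ASSUMING simple modulo indexing.

    This is a textbook model, not a documented Goldmont detail.
    """
    return (address // line_size) % sets


if __name__ == "__main__":
    for name, (size, ways) in CACHES.items():
        print(f"{name}: {num_sets(size, ways)} sets")
```

Under these figures both L1 caches come out at 64 sets and the L2 at 1024 sets; the 24 KiB data cache reaches the same set count as the 32 KiB instruction cache because it has 6 ways instead of 8.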

=== Multithreading ===
Goldmont, like {{\\|Airmont}}, has no support for Intel Hyper-Threading Technology.

== Die Shot ==
Intel {{intel|Atom}} E3900 SoC series:
:[[File:atom e3900 die shot.jpg|650px]]

== All Goldmont Chips ==
<!-- NOTE: 
           This table is generated automatically from the data in the actual articles.
           If a microprocessor is missing from the list, an appropriate article for it needs to be
           created and tagged accordingly.

           Missing a chip? please dump its name here: http://en.wikichip.org/wiki/WikiChip:wanted_chips
-->
<table class="wikitable sortable">
<tr><th colspan="12" style="background:#D6D6FF;">Goldmont Chips</th></tr>
<tr><th colspan="9">Main processor</th><th colspan="3">IGP</th></tr>
<tr><th>Model</th><th>Family</th><th>Platform</th><th>Core</th><th>Launched</th><th>SDP</th><th>TDP</th><th>Freq</th><th>Max Mem</th><th>Name</th><th>Freq</th><th>Max Freq</th></tr>
{{#ask: [[Category:microprocessor models by intel]] [[microarchitecture::Goldmont]]
 |?full page name
 |?model number
 |?microprocessor family
 |?platform
 |?core name
 |?first launched
 |?sdp
 |?tdp
 |?base frequency
 |?max memory
 |?integrated gpu
 |?integrated gpu base frequency
 |?integrated gpu max frequency
 |format=template
 |template=proc table 2
 |userparam=13
 |mainlabel=-
}}
{{table count|col=12|ask=[[Category:microprocessor models by intel]] [[microarchitecture::Goldmont]]}}
</table>