From WikiChip
Editing intel/microarchitectures/sunny cove

Latest revision Your text
Line 114: Line 114:
 
[[File:skylake - sunny cove changes block.jpg|thumb|right|Skylake to Sunny Cove changes]][[File:sunny cove enhancements.jpg|thumb|right|Sunny Cove enhancements]][[File:sunny cove buffer capacities.png|thumb|right|Sunny Cove buffers]]
 
[[File:skylake - sunny cove changes block.jpg|thumb|right|Skylake to Sunny Cove changes]][[File:sunny cove enhancements.jpg|thumb|right|Sunny Cove enhancements]][[File:sunny cove buffer capacities.png|thumb|right|Sunny Cove buffers]]
 
* Performance
 
* Performance
** [[IPC]] uplift ([[Intel]] self-reported average 18-20% IPC across proxy benchmarks such as [[SPEC CPU2006]]/[[SPEC CPU2017]])
+
** [[IPC]] uplift ([[Intel]] self-reported average 18% IPC across proxy benchmarks such as [[SPEC CPU2006]]/[[SPEC CPU2017]])
 
 
 
* Front-end
 
* Front-end
** 1.5x larger µOP cache (2.3K entries, up from 1536)
+
** 1.5x larger µOP cache (2.25k entries, up from 1536)
 
** Smarter [[prefetchers]]
 
** Smarter [[prefetchers]]
 
** Improved [[branch predictor]]
 
** Improved [[branch predictor]]
 
** ITLB
 
** ITLB
*** 2x 2M page entries (16 entries, up from 8)
+
*** Double 2M page entries (16 entries, up from 8)
 
** Larger IDQ (70 µOPs, up from 64)
 
** Larger IDQ (70 µOPs, up from 64)
 
** LSD can detect up to 70 µOP loops (up from 64)
 
** LSD can detect up to 70 µOP loops (up from 64)
 
* Back-end
 
* Back-end
** Wider allocation (6-way, up from 5-way in Skylake and 4-way in Broadwell)
+
** Wider allocation (5-way, up from 4-way)
** Delivery throughput remains 6 µOPs, same as Skylake.
 
** Wider decode with one additional simple decoder (4 simple + 1 complex in Sunny Cove's 5-way decoder, up from 3 simple + 1 complex in Skylake's 4-way decoder)
 
 
** 1.6x larger ROB (352, up from 224 entries)
 
** 1.6x larger ROB (352, up from 224 entries)
 
** Scheduler
 
** Scheduler
*** 1.65x larger scheduler (160-entry, up from 97 entries)
+
*** Larger scheduler (160, up from 97 entries)
 
*** Larger dispatch (10-way, up from 8-way)
 
*** Larger dispatch (10-way, up from 8-way)
** 1.55x larger integer register file (280-entry, up from 180)
 
** 1.33x larger vector register file (224-entry, up from 168)
 
** Distributed scheduling queues (4 scheduling queues, up from 2)
 
*** New dedicated queue for store data
 
*** Replaced 2 generic AGUs with two load AGUs
 
*** Load/Store pair have dedicated queues
 
**** New paired store capabilities
 
 
* Execution Engine
 
* Execution Engine
 
** Execution ports rebalanced
 
** Execution ports rebalanced
 
** 2x store data ports (up from 1)
 
** 2x store data ports (up from 1)
 
** 2x store address AGU (up from 1)
 
** 2x store address AGU (up from 1)
 +
** New paired store capabilities
 +
** Replaced 2 generic AGUs with two load AGUs
 
* Memory subsystem
 
* Memory subsystem
** Data Cache
 
*** DTLB now split for load and stores
 
*** Load
 
**** DTLB 4 KiB TLB competitively shared (from fixed partitioning)
 
**** DTLB 2 MiB / 4 MiB TLB competitively shared (from fixed partitioning)
 
**** 2x larger DTLB 1 GiB page entries (8-entry, up from 4)
 
*** Store
 
**** New DTLB store
 
**** 16-entry, all page sizes
 
** STLB
 
*** Single unified TLB for all pages (from 4 KiB+2/4 MiB and separate 1 GiB)
 
*** STLB uses dynamic partitioning (from fixed partitioning)
 
 
** LSU
 
** LSU
 
*** 1.8x more inflight loads (128, up from 72 entries)
 
*** 1.8x more inflight loads (128, up from 72 entries)
Line 162: Line 142:
 
** 2x larger L2 cache (512 KiB, up from 256 KiB)
 
** 2x larger L2 cache (512 KiB, up from 256 KiB)
 
*** Larger STLBs
 
*** Larger STLBs
**** 1.33x larger 4k table (2048 entries, up from 1536)
+
**** Larger 1G table (1024-entry, up from 16)
 +
**** Larger 4k table (2048 entries, up from 1536)
 +
**** New 1,024-entry 2M/4M table
 
** 5-Level Paging
 
** 5-Level Paging
 
*** Large virtual address (57 bits, up from 48 bits)
 
*** Large virtual address (57 bits, up from 48 bits)
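
The jump from 48 to 57 virtual address bits follows directly from adding one paging level. The following is a minimal sketch of that arithmetic, assuming the standard x86-64 layout of a 12-bit page offset (4 KiB pages) and 9 index bits per paging level; the function names are illustrative only.

```python
# Sketch: virtual address width as a function of paging depth on x86-64.
# Assumes 4 KiB base pages (12 offset bits) and 512-entry tables (9 index bits per level).
PAGE_OFFSET_BITS = 12
INDEX_BITS_PER_LEVEL = 9


def virtual_address_bits(levels: int) -> int:
    """Number of translated virtual address bits for a given paging depth."""
    return PAGE_OFFSET_BITS + levels * INDEX_BITS_PER_LEVEL


def address_space_bytes(levels: int) -> int:
    """Size of the virtual address space in bytes."""
    return 1 << virtual_address_bits(levels)


if __name__ == "__main__":
    for levels in (4, 5):
        bits = virtual_address_bits(levels)
        tib = address_space_bytes(levels) // 2**40
        print(f"{levels}-level paging: {bits}-bit virtual addresses, {tib:,} TiB")
```

Each extra level multiplies the addressable virtual space by 512, which is why 5-level paging reaches 128 PiB where 4-level paging stops at 256 TiB.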
Line 195: Line 177:
  
 
=== Block diagram ===
 
=== Block diagram ===
:[[File: Sunny_cove_block_diagram.png|950px]]
+
:[[File:sunny cove block diagram.svg|950px]]
 
 
=== Memory Hierarchy ===
 
* Cache
 
** L0 µOP cache:
 
*** 2,304 µOPs, 8-way set associative
 
**** 48 sets, 6-µOP line size
 
**** statically divided between threads, per core, inclusive with L1I
 
** L1I Cache:
 
*** 32 [[KiB]], 8-way set associative
 
**** 64 sets, 64 B line size
 
**** shared by the two threads, per core
 
** L1D Cache:
 
*** 48 KiB, 12-way set associative
 
*** 64 sets, 64 B line size
 
*** shared by the two threads, per core
 
*** 4 cycles for fastest load-to-use (simple pointer accesses)
 
**** 5 cycles for complex addresses
 
*** bandwidth
 
**** 2x 64 B/cycle load + 1x64 B/cycle store
 
**** OR 2x32 B/cycle store
 
*** Write-back policy
 
** L2 Cache:
 
*** Client
 
**** Unified, 512 KiB, 8-way set associative
 
**** 1024 sets, 64 B line size
 
*** Server
 
**** Unified, 1,280 KiB, 20-way set associative
 
**** 1024 sets, 64 B line size
 
*** Non-inclusive
 
*** 13 cycles for fastest load-to-use
 
*** 64 B/cycle bandwidth to L1$
 
*** Write-back policy
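
The capacities listed above can be cross-checked from the stated sets, associativity, and line size (capacity = sets × ways × line size). The sketch below does that check; the <code>set_index</code> helper assumes plain modulo indexing on the line address, which is a simplification and not something the source confirms for any of these caches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    name: str
    sets: int
    ways: int
    line_bytes: int

    @property
    def capacity_kib(self) -> int:
        # capacity = sets * ways * line size
        return self.sets * self.ways * self.line_bytes // 1024


# Figures taken from the Memory Hierarchy list above.
CACHES = [
    CacheGeometry("L1I", sets=64, ways=8, line_bytes=64),
    CacheGeometry("L1D", sets=64, ways=12, line_bytes=64),
    CacheGeometry("L2 (client)", sets=1024, ways=8, line_bytes=64),
    CacheGeometry("L2 (server)", sets=1024, ways=20, line_bytes=64),
]


def set_index(address: int, cache: CacheGeometry) -> int:
    """Hypothetical modulo indexing: drop the line offset, then wrap by set count."""
    return (address // cache.line_bytes) % cache.sets


if __name__ == "__main__":
    for cache in CACHES:
        print(f"{cache.name}: {cache.capacity_kib} KiB")
```

Note that the L1D grew from 32 KiB to 48 KiB by adding ways (8 to 12) while keeping 64 sets, so the number of index bits is unchanged.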
 
 
 
Sunny Cove TLB consists of a dedicated L1 TLB for instruction cache (ITLB) and another one for data cache (DTLB). Additionally, there is a unified L2 TLB (STLB).
 
* TLBs:
 
** ITLB
 
*** 4 KiB page translations:
 
**** 128 entries; 8-way set associative
 
**** dynamic partitioning
 
*** 2 MiB / 4 MiB page translations:
 
**** 16 entries per thread; fully associative
 
**** Duplicated for each thread
 
** DTLB
 
*** Load
 
**** 4 KiB page translations:
 
***** 64 entries; 4-way set associative
 
***** competitively shared
 
**** 2 MiB / 4 MiB page translations:
 
***** 32 entries; 4-way set associative
 
***** competitively shared
 
**** 1G page translations:
 
***** 8 entries; 8-way set associative
 
***** competitively partitioned
 
*** Store
 
**** All pages:
 
***** 16 entries; 16-way set associative
 
***** competitively partitioned
 
** STLB
 
*** All pages:
 
**** 2,048 entries; 16-way set associative
 
**** Partitioning:
 
***** 4 KiB pages can use all 2,048 entries
 
***** 2/4 MiB pages can use 1,024 entries (8-way sets), shared with 4 KiB
 
***** 1 GiB pages can use 1,024 entries (8-way sets), shared with 4 KiB pages
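
The STLB partitioning figures are consistent with restricting large pages to half of the ways in each set. The sketch below works through that arithmetic; treating the partition as a per-set way limit is an interpretation of the "8-way sets" wording above, not a documented mechanism.

```python
# Sketch: usable STLB entries per page size, assuming the partition is a
# per-set way limit (an interpretation of the "8-way sets" note above).
STLB_ENTRIES = 2048
STLB_WAYS = 16
STLB_SETS = STLB_ENTRIES // STLB_WAYS  # 128 sets

# Ways each page size may occupy within a set.
WAYS_AVAILABLE = {"4K": 16, "2M/4M": 8, "1G": 8}


def usable_entries(page_size: str) -> int:
    """Maximum number of STLB entries a given page size can occupy."""
    return STLB_SETS * WAYS_AVAILABLE[page_size]


if __name__ == "__main__":
    for size in WAYS_AVAILABLE:
        print(f"{size}: up to {usable_entries(size)} entries")
```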
 
  
 
== Overview ==
 
== Overview ==
Sunny Cove is Intel's microarchitecture for their [[big core|big CPU core]], which is incorporated into a number of client and server chips that succeed {{\\|Palm Cove}} (and effectively the {{\\|Skylake (client)|Skylake}} series of derivatives). The core is used in numerous chips made by Intel, including {{\\|Lakefield}}, {{\\|Ice Lake (Client)}}, and {{\\|Ice Lake (Server)}}, as well as the [[Nervana]] {{nervana|NNP}} accelerator. Sunny Cove introduces a large set of enhancements that improve the performance of legacy code and new code through the extraction of parallelism as well as new features. Those include a deeper [[out-of-window]] pipeline, a wider execution back-end, higher load-store bandwidth, lower effective access latencies, and bigger caches.
+
Sunny Cove is Intel's microarchitecture for the CPU core which is incorporated into a number of client and server chips that succeed {{\\|Palm Cove}} (and effectively the {{\\|Skylake (client)|Skylake}} series of derivatives). Sunny Cove is just the core, which is implemented in numerous chips made by Intel including {{\\|Lakefield}}, {{\\|Ice Lake (Client)}}, {{\\|Ice Lake (Server)}}, and the [[Nervana]] {{nervana|NNP}} accelerator. Sunny Cove introduces a large set of enhancements that significantly improve the performance of legacy code and new code through the extraction of parallelism as well as new features. Those include a significantly deeper [[out-of-window]] pipeline, a wider execution back-end, higher load-store bandwidth, lower effective access latencies, and bigger caches.
  
 
== Pipeline ==
 
== Pipeline ==
Line 317: Line 235:
 
{{see also|intel/microarchitectures/sandy_bridge_(client)#New_.C2.B5OP_cache_.26_x86_tax|l1=Sandy Bridge § New µOP cache}}
 
{{see also|intel/microarchitectures/sandy_bridge_(client)#New_.C2.B5OP_cache_.26_x86_tax|l1=Sandy Bridge § New µOP cache}}
 
[[File:sunny cove ucache.svg|right|400px]]
 
[[File:sunny cove ucache.svg|right|400px]]
Decoding the variable-length, inconsistent, and complex [[x86]] instructions is a nontrivial task. It's also expensive in terms of performance and power. Therefore, the best way for the pipeline to avoid those things is to simply not decode the instructions. This is the job of the [[µOP cache]] or the Decoded Stream Buffer (DSB). Sunny Cove's µOP cache is organized similarly to all previous generations since its introduction in {{\\|Sandy Bridge}}; however, its size has increased. Sunny Cove increased the cache by 1.5x from 1.5K in {{\\|Skylake}} to over 2.3K. The cache is organized into 48 sets of 8 cache lines with each line holding up to 6 µOPs for a total of 2,304 µOPs. As with {{\\|Skylake}}, the µOP cache operates on 64-byte fetch windows. The micro-operation cache is competitively shared between the two threads and can also hold pointers to the microcode. The µOP cache has an average hit rate of 80% or greater.
+
Decoding the variable-length, inconsistent, and complex [[x86]] instructions is a nontrivial task. It's also expensive in terms of performance and power. Therefore, the best way for the pipeline to avoid those things is to simply not decode the instructions. This is the job of the [[µOP cache]] or the Decoded Stream Buffer (DSB). Sunny Cove's µOP cache is organized similarly to all previous generations since its introduction in {{\\|Sandy Bridge}}; however, its size has increased. The cache is organized into 48 sets of 8 cache lines with each line holding up to 6 µOPs for a total of 2,304 µOPs. As with {{\\|Skylake}}, the µOP cache operates on 64-byte fetch windows. The micro-operation cache is competitively shared between the two threads and can also hold pointers to the microcode. The µOP cache has an average hit rate of 80% or greater.
  
A hit in the µOP cache allows for up to 6 µOPs (i.e., an entire line) per cycle to be sent directly to the Instruction Decode Queue (IDQ), bypassing all the pre-decoding and decoding that would otherwise have to be done. Whereas the legacy decode path works in 16-byte instruction fetch windows, the µOP cache has no such restriction and can deliver 6 µOPs/cycle corresponding to the much bigger 64-byte window. The higher bandwidth of µOPs greatly improves the number of µOPs that the back-end can take advantage of in the [[out-of-order]] part of the machine. To further improve this area, Sunny Cove increased the [[#Renaming & Allocation|rename and retire]] width to 5 µOPs/cycle, one more than {{\\|Skylake}}, raising the absolute ceiling rate of the out-of-order engine.
+
A hit in the µOP cache allows for up to 6 µOPs (i.e., an entire line) per cycle to be sent directly to the Instruction Decode Queue (IDQ), bypassing all the pre-decoding and decoding that would otherwise have to be done. Whereas the legacy decode path works in 16-byte instruction fetch windows, the µOP cache has no such restriction and can deliver 6 µOPs/cycle corresponding to the much bigger 64-byte window. Previously (e.g., {{\\|Broadwell}}), the bandwidth was lower at 4 µOPs per cycle. The 1.5x bandwidth increase greatly improves the number of µOPs that the back-end can take advantage of in the [[out-of-order]] part of the machine. This change attempts to improve instruction rate by alleviating [[bubbles]]; however, everything is still hard-limited by the [[#Renaming & Allocation|rename and retire]] stages, which put an absolute ceiling rate of four fused µOPs per cycle.
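
The µOP cache figures quoted in this section can be reproduced from the stated organization. The sketch below is only a consistency check of those numbers; the constant names are illustrative, and the 1,536-µOP Skylake figure is the one quoted in the enhancement list.

```python
# Sketch: µOP cache capacity from its organization (sets x ways x µOPs per line).
UOP_CACHE_SETS = 48
UOP_CACHE_WAYS = 8
UOPS_PER_LINE = 6
SKYLAKE_UOP_CACHE_UOPS = 1536  # "up from 1536" in the enhancement list


def uop_cache_capacity(sets: int = UOP_CACHE_SETS,
                       ways: int = UOP_CACHE_WAYS,
                       uops_per_line: int = UOPS_PER_LINE) -> int:
    """Total number of µOPs the cache can hold."""
    return sets * ways * uops_per_line


if __name__ == "__main__":
    capacity = uop_cache_capacity()
    growth = capacity / SKYLAKE_UOP_CACHE_UOPS
    print(f"{capacity} µOPs, {growth:.2f}x the Skylake capacity")
```

This confirms that "2.25k entries" and "over 2.3K" both describe the same 2,304-µOP structure, depending on whether K is taken as 1,024 or 1,000.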
  
 
===== Allocation Queue =====
 
===== Allocation Queue =====
Line 329: Line 247:
 
The LSD in Sunny Cove can take advantage of the larger IDQ and is capable of detecting loops of up to 70 µOPs per thread. The LSD is particularly effective for common algorithms found in many programs (e.g., tight loops, intensive calculation loops, searches, etc.).
 
The LSD in Sunny Cove can take advantage of the larger IDQ and is capable of detecting loops of up to 70 µOPs per thread. The LSD is particularly effective for common algorithms found in many programs (e.g., tight loops, intensive calculation loops, searches, etc.).
  
==== Execution engine ====
+
=== Back-end ===
[[File:sunny cove rob.svg|right|450px]]
+
{{empty section}}
Sunny Cove's back-end or execution engine deals with the execution of [[out-of-order]] operations. Much of the design is inherited from previous architectures such as {{\\|Skylake}} but has been widened to explore more [[instruction-level parallelism]] opportunities. From the allocation queue, instructions are sent to the [[Reorder Buffer]] (ROB) at the rate of up to 6 fused-µOPs each cycle, similar to {{\\|Skylake}}'s.
 
 
 
===== Renaming & Allocation =====
 
Like the front-end, the [[Reorder Buffer]] has been significantly enlarged, by roughly 60%, to a capacity of 352 entries, 128 entries more than {{\\|Skylake}}. Since each ROB entry holds complete µOPs, in practice 352 entries might be equivalent to as much as 525 µOPs depending on the code being executed (e.g. fused load/stores). It is at this stage that [[architectural registers]] are mapped onto the underlying [[physical registers]]. Other bookkeeping tasks are also done at this point, such as allocating resources for stores and loads and determining all possible scheduler ports. Register renaming is controlled by the [[Register Alias Table]] (RAT), which is used to mark where the data an instruction depends on is coming from (that value, too, came from an instruction that has previously been renamed). In {{intel|microarchitectures|previous microarchitectures}}, the RAT could handle 4 µOPs each cycle. In Sunny Cove this has been increased to five, a 25% increase in the OoO allocation capabilities. Sunny Cove can now rename any five registers per cycle. This includes the same register renamed five times in a single cycle. Note that the ROB still operates on fused µOPs, therefore 5 µOPs can effectively be as high as 10 µOPs.
 
 
 
It should be noted that there are no special costs involved in splitting up fused µOPs before execution or [[retirement]] and the two fused µOPs only occupy a single entry in the ROB.
 
 
 
Since Sunny Cove performs [[speculative execution]], it can speculate incorrectly. When this happens, the architectural state is invalidated and as such needs to be rolled back to the last known valid state. {{intel|microarchitectures|previous microarchitectures}} had a 48-entry [[Branch Order Buffer]] (BOB) that keeps track of those states for this very purpose. It's unknown if that has changed with Sunny Cove.
 
 
 
===== Optimizations =====
 
Sunny Cove has a number of optimizations it performs prior to entering the out-of-order and renaming part. Three of those optimizations are [[Move Elimination]], [[Zeroing Idioms]], and [[Ones Idioms]]. Move elimination is capable of eliminating register-to-register moves (including chained moves) prior to bookkeeping at the ROB, saving resources by removing those µOPs entirely. Eliminated moves are zero latency and are entirely removed from the pipeline. This optimization does not always succeed; when it fails, the operands were simply not ready. On average this optimization is almost always successful (upward of 85% in most cases). Move elimination works on all 32- and 64-bit GP integer registers as well as all 128- and 256-bit vector registers.
 
{| style="border: 1px solid gray; float: right; margin: 10px; padding: 5px; width: 350px;"
 
| [[Zeroing Idiom]] Example:
 
|-
 
| <pre>xor eax, eax</pre>
 
|-
 
| Not only does this instruction get eliminated at the ROB, but it's actually encoded as just 2 bytes <code>31 C0</code> vs the 5 bytes for <code>{{x86|mov}} {{x86|eax}}, 0x0</code> which is encoded as <code>b8 00 00 00 00</code>.
 
|}
 
There are some exceptions that Sunny Cove will not optimize, most dealing with [[signedness]]. [[sign extension|Sign-extended]] moves cannot be eliminated and neither can zero-extended moves from 16-bit to 32/64-bit registers (note that 8-bit to 32/64-bit works). Likewise, in the other direction, no moves to 8/16-bit registers can be eliminated. A move of a register to itself is never eliminated.
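
The eligibility rules described above can be summarized as a small predicate. This is a simplified sketch built only from the rules stated in this section; the function name and its parameters are hypothetical, unlisted width combinations are conservatively treated as not eligible, and a <code>True</code> result only means the move is a candidate, since elimination does not always succeed.

```python
# Sketch of the move-elimination eligibility rules described in the text.
# Names and parameters are illustrative, not taken from Intel documentation.
GP_WIDTHS = {32, 64}
VECTOR_WIDTHS = {128, 256}


def move_may_be_eliminated(dest: str, src: str, dest_bits: int, src_bits: int,
                           sign_extend: bool = False) -> bool:
    """Return True if a register-to-register move is a candidate for elimination."""
    if dest == src:
        return False              # a move of a register to itself is never eliminated
    if sign_extend:
        return False              # sign-extended moves cannot be eliminated
    if dest_bits in (8, 16):
        return False              # no moves to 8/16-bit registers are eliminated
    if src_bits == dest_bits:
        # plain moves: 32/64-bit GP registers and 128/256-bit vector registers
        return dest_bits in GP_WIDTHS | VECTOR_WIDTHS
    # zero-extending moves: 8-bit sources work, 16-bit sources do not;
    # anything else is treated as not eligible (assumption)
    return src_bits == 8 and dest_bits in GP_WIDTHS


if __name__ == "__main__":
    print(move_may_be_eliminated("rax", "rbx", 64, 64))   # plain 64-bit move
    print(move_may_be_eliminated("eax", "bx", 32, 16))    # zero-extend from 16-bit
```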
 
 
 
When instructions use registers that are independent of their prior values, another optimization opportunity can be exploited. A second common optimization performed in Sunny Cove around the same time is [[Zeroing Idioms]] elimination. A number of common zeroing idioms are recognized and consequently eliminated in much the same way as the move eliminations are performed. Sunny Cove recognizes instructions such as <code>{{x86|XOR}}</code>, <code>{{x86|PXOR}}</code>, and <code>{{x86|XORPS}}</code> as zeroing idioms when the [[source operand|source]] and [[destination operand|destination]] operands are the same. Those optimizations are done during renaming, at the same rate as renaming (4 µOPs per cycle), and the register is simply set to zero.
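
As a rough illustration of the recognition rule above, the sketch below flags an instruction as a zeroing idiom when its mnemonic is one of the three named in the text and both operands are the same register. The mnemonic set is only the subset mentioned here, the function name is hypothetical, and the byte strings are the encodings quoted in the zeroing idiom example box.

```python
# Sketch: recognizing a zeroing idiom by mnemonic and operand identity.
# Only the mnemonics named in the text are listed; real hardware recognizes more.
ZEROING_MNEMONICS = {"xor", "pxor", "xorps"}


def is_zeroing_idiom(mnemonic: str, dest: str, src: str) -> bool:
    """True when the instruction zeroes its destination regardless of prior value."""
    return mnemonic.lower() in ZEROING_MNEMONICS and dest.lower() == src.lower()


# Encodings quoted in the example box: the idiom is also the shorter instruction.
XOR_EAX_EAX = bytes.fromhex("31 C0")           # xor eax, eax
MOV_EAX_ZERO = bytes.fromhex("b8 00 00 00 00")  # mov eax, 0x0

if __name__ == "__main__":
    print(is_zeroing_idiom("XOR", "eax", "eax"))
    print(len(XOR_EAX_EAX), "bytes vs", len(MOV_EAX_ZERO), "bytes")
```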
 
 
 
[[Ones idioms]] are another dependency-breaking idiom that can be optimized. In all the various {{x86|PCMPEQ|PCMPEQx}} instructions, which perform packed comparison, comparing a register with itself always sets all bits to one. In those cases, while the µOP still has to be executed, the instruction may be scheduled as soon as possible because the current state of the register need not be known.
 
 
 
===== Scheduler =====
 
[[File:sunny cove scheduler.svg|right|500px]]
 
The scheduler size itself has likely increased with Sunny Cove, although its exact capacity was not disclosed. Intel increased the scheduler by 50% from 64 to 97 entries in {{\\|Skylake}}, therefore it's reasonable to expect Sunny Cove to be greater than 125 entries. Those entries are competitively shared between the two threads. Sunny Cove continues with a unified design; this is in contrast to designs such as [[AMD]]'s {{amd|Zen|l=arch}}, which uses a split design with each scheduler holding different types of µOPs. The scheduler includes the two register files for integers and vectors. It's in those [[register files]] that output operand data is stored. In Skylake, the [[integer]] [[register file]] was 180 entries deep. It's unknown if that has changed and by how much on Sunny Cove.
 
 
 
At this point µOPs are no longer fused and will be dispatched to the execution units independently. The scheduler holds the µOPs while they wait to be executed. A µOP could be waiting on an operand that has not arrived (e.g., one being fetched from memory or currently being calculated by another µOP) or because the execution unit it needs is busy. Once the µOP is ready, it is dispatched through its designated port. The scheduler will send the oldest ready µOP to be executed on each of the ten ports each cycle.
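
The oldest-ready-first policy described above can be sketched as a single dispatch cycle. This is a toy model under stated assumptions: each µOP is bound to exactly one port, readiness is modeled as an empty set of outstanding operands, and the <code>Uop</code> class and <code>dispatch_cycle</code> function are hypothetical names, not a description of the actual hardware.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(eq=False)  # identity comparison, so removal targets the exact object
class Uop:
    age: int                      # allocation order; lower means older
    port: int                     # designated port (single-port binding is an assumption)
    waiting_on: Set[str] = field(default_factory=set)  # operands not yet available

    @property
    def ready(self) -> bool:
        return not self.waiting_on


def dispatch_cycle(scheduler: List[Uop], num_ports: int = 10) -> Dict[int, Uop]:
    """Pick at most one µOP per port: the oldest one whose operands are ready."""
    picked: Dict[int, Uop] = {}
    for uop in sorted(scheduler, key=lambda u: u.age):
        if uop.ready and uop.port < num_ports and uop.port not in picked:
            picked[uop.port] = uop
    for uop in picked.values():
        scheduler.remove(uop)     # dispatched µOPs leave the scheduler
    return picked


if __name__ == "__main__":
    entries = [
        Uop(age=0, port=0, waiting_on={"rax"}),  # oldest, but still waiting
        Uop(age=1, port=0),
        Uop(age=2, port=0),
        Uop(age=3, port=2),
    ]
    for port, uop in sorted(dispatch_cycle(entries).items()):
        print(f"port {port}: dispatched µOP with age {uop.age}")
```

In this model the oldest µOP on port 0 is skipped because it is not ready, so the next-oldest ready µOP is dispatched instead and the stalled one stays in the scheduler for a later cycle.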
 
 
 
The scheduler on Sunny Cove was enlarged further with two additional ports that deal with memory operations, making it 25% wider than {{\\|Skylake}}. Up to 10 operations may be dispatched each cycle. On the arithmetic side of the execution engine, the four workhorse ports were augmented with more functionality. On the vector side, Sunny Cove retains the three FMAs and ALUs. One of the key changes here is the addition of a new shuffle unit on Port 1 for moving data within a register.
 
  
====== Scheduler Ports & Execution Units ======
+
{{work-in-progress}}
<table class="wikitable">
 
<tr><th colspan="2">Scheduler Ports Designation</th></tr>
 
<tr><th rowspan="5">Port 0</th><td>Integer/Vector Arithmetic, Multiplication, Logic, Shift, and String ops</td></tr>
 
<tr><td>[[FP]] Add, [[Multiply]], [[FMA]]</td></tr>
 
<tr><td>Integer/FP Division and [[Square Root]]</td></tr>
 
<tr><td>[[AES]] Encryption</td></tr>
 
<tr><td>Branch2</td></tr>
 
<tr><th rowspan="2">Port 1</th><td>Integer/Vector Arithmetic, Multiplication, Logic, Shift, and Bit Scanning</td></tr>
 
<tr><td>[[FP]] Add, [[Multiply]], [[FMA]]</td></tr>
 
<tr><th rowspan="3">Port 5</th><td>Integer/Vector Arithmetic, Logic</td></tr>
 
<tr><td>Vector Permute</td></tr>
 
<tr><td>[[x87]] FP Add, Composite Int, CLMUL</td></tr>
 
<tr><th rowspan="2">Port 6</th><td>Integer Arithmetic, Logic, Shift</td></tr>
 
<tr><td>Branch</td></tr>
 
<tr><th>Port 2</th><td>Load AGU</td></tr>
 
<tr><th>Port 3</th><td>Load AGU</td></tr>
 
<tr><th>Port 4</th><td>Store Data</td></tr>
 
<tr><th>Port 7</th><td>Store AGU</td></tr>
 
<tr><th>Port 8</th><td>Store AGU</td></tr>
 
<tr><th>Port 9</th><td>Store Data</td></tr>
 
</table>
 
 
 
{| class="wikitable collapsible collapsed"
 
|-
 
! colspan="3" | Execution Units
 
|-
 
! Execution Unit !! # of Units !! Instructions
 
|-
 
| ALU || 4 || add, and, cmp, or, test, xor, movzx, movsx, mov, (v)movdqu, (v)movdqa, (v)movap*, (v)movup*
 
|-
 
| SHFT || 2 || sal, shl, rol, adc, sarx, adcx, adox, etc.
 
|-
 
| Slow Int || 1 || mul, imul, bsr, rcl, shld, mulx, pdep, etc.
 
|-
 
| BM || 2 || andn, bextr, blsi, blsmsk, bzhi, etc.
 
|-
 
| Vec ALU || 3 || (v)pand, (v)por, (v)pxor, (v)movq, (v)movq, (v)movap*, (v)movup*, (v)andp*, (v)orp*, (v)paddb/w/d/q, (v)blendv*, (v)blendp*, (v)pblendd
 
|-
 
| Vec_Shft || 2 || (v)psllv*, (v)psrlv*, vector shift count in imm8
 
|-
 
| Vec Add || 2 || (v)addp*, (v)cmpp*, (v)max*, (v)min*, (v)padds*, (v)paddus*, (v)psign, (v)pabs, (v)pavgb, (v)pcmpeq*, (v)pmax, (v)cvtps2dq, (v)cvtdq2ps, (v)cvtsd2si, (v)cvtss2si
 
|-
 
| Shuffle || 2 || (v)shufp*, vperm*, (v)pack*, (v)unpck*, (v)punpck*, (v)pshuf*, (v)pslldq, (v)alignr, (v)pmovzx*, vbroadcast*, (v)pslldq, (v)psrldq, (v)pblendw
 
|-
 
| Vec Mul || 2 || (v)mul*, (v)pmul*, (v)pmadd*
 
|-
 
| SIMD Misc || 1 || STTNI, (v)pclmulqdq, (v)psadw, vector shift count in xmm
 
|-
 
| FP Mov || 1 || (v)movsd/ss, (v)movd gpr
 
|-
 
| DIVIDE || 1 || divp*, divs*, vdiv*, sqrt*, vsqrt*, rcp*, vrcp*, rsqrt*, idiv
 
|-
 
|-
 
|colspan="3" | This table was taken verbatim from the Intel manual. Execution unit mapping to {{x86|MMX|MMX instructions}} are not included.
 
|}
 
 
 
===== Retirement =====
 
Once a µOP executes, or in the case of fused µOPs both µOPs have executed, they can be [[retired]]. Retirement happens [[in-order]] and releases any used resources such as those used to keep track in the [[reorder buffer]]. With retirement/allocation increasing from four to five in Sunny Cove, it's now possible to retire 5 instructions per cycle (5 unfused or 7 with fused ops).
 
  
 
== Die ==
 
== Die ==
 
=== Core ===
 
=== Core ===
* [[Intel 10 nm process]]
+
* [[10 nm process|10nm+ process]]
* Core from an {{\\|Ice Lake (client)}} SoC
+
* Core from an {{\\|Ice Lake (client)|Ice Lake}} SoC
 
* ~6.91 mm² die size
 
* ~6.91 mm² die size
 
** ~3.5 mm x ~1.97 mm
 
** ~3.5 mm x ~1.97 mm
Line 437: Line 266:
  
 
:[[File:ice lake die core 2.png|500px]]
 
:[[File:ice lake die core 2.png|500px]]
 
 
 
* [[Intel 10 nm process]]
 
* Core from an {{\\|Ice Lake (server)}} SoC
 
 
:[[File:ice lake core die.png|200px]]
 
  
 
=== Core group ===
 
=== Core group ===
* [[Intel 10 nm process]]
+
* [[10 nm process|10nm+ process]]
 
* Quad-core from an {{\\|Ice Lake (client)|Ice Lake}} SoC
 
* Quad-core from an {{\\|Ice Lake (client)|Ice Lake}} SoC
 
* ~30.73 mm² die size
 
* ~30.73 mm² die size

* codename: Sunny Cove
* core count: 2, 4, 8, 10, 12, 16, 18, 20, 24, 26, 28, 32, 36, 38 and 40
* designer: Intel
* first launched: 2019
* full page name: intel/microarchitectures/sunny cove
* instance of: microarchitecture
* instruction set architecture: x86-64
* manufacturer: Intel
* microarchitecture type: CPU
* name: Sunny Cove
* phase-out: 2021
* pipeline stages (max): 19
* pipeline stages (min): 14
* process: 10 nm (0.01 μm, 1.0e-5 mm)