From WikiChip

Revision as of 16:48, 11 May 2020

Comet Lake µarch
General Info
Arch Type: CPU
Designer: Intel
Manufacturer: Intel
Process: 14 nm
Core Configs: 4
Pipeline
Type: Superscalar, Superpipeline
OoOE: Yes
Speculative: Yes
Reg Renaming: Yes
Stages: 14-19
Decode: 5-way
Instructions
ISA: x86-64
Extensions: MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA3, F16C, BMI, BMI2, VT-x, VT-d, TXT, TSX, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVE, SGX, MPX
Cache
L1I Cache: 32 KiB/core, 8-way set associative
L1D Cache: 32 KiB/core, 8-way set associative
L2 Cache: 256 KiB/core, 4-way set associative
L3 Cache: 2 MiB/core, up to 16-way set associative
L4 Cache: 128 MiB/package, on Iris Pro GPUs only
Cores
Core Names: Comet Lake U, Comet Lake S

Comet Lake (CML) is Intel's successor to Coffee Lake: a microarchitecture built on an enhanced 14 nm process for mainstream desktops and mobile devices.

For desktop and mobile, Comet Lake is branded as 10th Generation Intel Core i3, Core i5, Core i7, and Core i9 processors.

Codenames

Core | Abbrev | Description | Graphics | Target
Comet Lake S | CML-S | Mainstream performance | GT2 | Desktop performance to value, AiOs, and minis
Comet Lake U | CML-U | Ultra-low power | GT2 | Light notebooks, portable All-in-Ones (AiOs), minis, and conference room

Brands

Intel released Comet Lake under four main brand families:

Family | General Description | Differentiating Features (Cores, HT, AVX, AVX2, TBT, ECC)
Core i3 | Low-end Performance |
Core i5 | Mid-range Performance |
Core i7 | High-end Performance |
Core i9 | High-end/Enthusiasts Performance |

Release Dates

Comet Lake processors were introduced in a number of phases. The initial mobile processors launched on August 21, 2019. Intel followed up with the desktop parts on April 30, 2020. Enterprise and vPro models followed on May 13, 2020.

Compatibility

Intel provides no official drivers for Windows 7 or Windows 8. Microsoft announced that only Windows 10 would support Kaby Lake and later processors. Linux added initial support for Kaby Lake starting with Linux Kernel 4.5.

Vendor | OS | Version | Notes
Microsoft | Windows | Windows 7 | No Support
Microsoft | Windows | Windows 8 | No Support
Microsoft | Windows | Windows 10 | Support
Linux | Linux | Kernel 4.5 | Initial Support (Fedora 24, Yocto v2.2, ..)
Google | Chromium | Chromium | Support
Wind River | VxWorks | VxWorks 7 | Support

Compiler support

Compiler | Arch-Specific | Arch-Favorable
ICC | -march=skylake | -mtune=skylake
GCC | -march=skylake | -mtune=skylake
LLVM | -march=skylake | -mtune=skylake
Visual Studio | /arch:AVX2 | /tune:skylake

CPUID

Further information: Intel CPUIDs
Core | Extended Family | Family | Extended Model | Model | Stepping
U | 0 | 0x6 | 0x8 | 0xE | 0xC
(Family 6 Model 142 Stepping 12)
S/H | 0 | 0x6 | 0xA | 0x5 | 0x0-0x5
(Family 6 Model 165 Stepping 0-5)
Stepping: G0=0, P0=1, R1=2, G1=3, P1=4, Q0=5
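
The fields in the table above combine into the displayed family and model using the standard x86 CPUID leaf 1 layout. The following is a minimal sketch of that decoding. The function name and the two example signature values (0x000806EC and 0x000A0655) are assumptions assembled from the table fields for illustration, not values quoted from Intel documentation.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID leaf 1 EAX value into family/model/stepping fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    display_family = family + ext_family if family == 0xF else family
    # The extended model supplies the high nibble for families 0x6 and 0xF.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model

    return {"family": display_family, "model": display_model, "stepping": stepping}


# Illustrative signatures built from the table rows (assumed values).
cml_u = decode_cpuid_signature(0x000806EC)  # Comet Lake U row
cml_s = decode_cpuid_signature(0x000A0655)  # Comet Lake S/H row, stepping 5 (Q0)

print(cml_u)
print(cml_s)
```

With these inputs the U row decodes to family 6, model 142, stepping 12, and the S/H row to family 6, model 165, stepping 5, matching the table.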

Architecture

Key changes from Coffee Lake

  • Enhanced "14nm++" process results in higher turbo frequencies
  • System Architecture
  • Memory
    • Faster memory for mainstream desktops (i.e., Comet Lake S) DDR4-2933 (from DDR4-2666)
  • Packaging

This list is incomplete.
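
To put the memory speed change in perspective, the theoretical peak bandwidth can be estimated from the transfer rate. This is a rough sketch that assumes two channels moving 8 bytes per transfer each, as listed under System DRAM in the memory hierarchy; the function name is hypothetical, and sustained bandwidth on real systems is lower than this peak.

```python
def peak_dram_bandwidth_gb_s(transfer_rate_mt_s: float,
                             channels: int = 2,
                             bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second)."""
    return transfer_rate_mt_s * 1e6 * channels * bytes_per_transfer / 1e9


ddr4_2666 = peak_dram_bandwidth_gb_s(2666)  # Coffee Lake mainstream desktop
ddr4_2933 = peak_dram_bandwidth_gb_s(2933)  # Comet Lake S

print(f"DDR4-2666: {ddr4_2666:.2f} GB/s")
print(f"DDR4-2933: {ddr4_2933:.2f} GB/s")
print(f"Increase:  {(ddr4_2933 / ddr4_2666 - 1) * 100:.1f} %")
```

Under these assumptions the move from DDR4-2666 to DDR4-2933 raises the peak from about 42.7 GB/s to about 46.9 GB/s, an increase of roughly 10 %.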

Block Diagram

This section is empty.

Gen9.5

See Gen9.5#Gen9.5.

Memory Hierarchy

The overall memory structure is identical to Skylake.

  • Cache
    • L0 µOP cache:
      • 1,536 µOPs, 8-way set associative
        • 32 sets, 6-µOP line size
        • statically divided between threads, per core, inclusive with L1I
    • L1I Cache:
      • 32 KiB, 8-way set associative
        • 64 sets, 64 B line size
        • shared by the two threads, per core
    • L1D Cache:
      • 32 KiB, 8-way set associative
      • 64 sets, 64 B line size
      • shared by the two threads, per core
      • 4 cycles for fastest load-to-use (simple pointer accesses)
        • 5 cycles for complex addresses
      • 64 B/cycle load bandwidth
      • 32 B/cycle store bandwidth
      • Write-back policy
    • L2 Cache:
      • Unified, 256 KiB, 4-way set associative
      • Non-inclusive
      • 1024 sets, 64 B line size
      • 12 cycles for fastest load-to-use
      • 64 B/cycle bandwidth to L1$
      • Write-back policy
    • L3 Cache/LLC:
      • Up to 2 MiB per core, shared across all cores
      • Up to 16-way set associative
      • Inclusive
      • 64 B line size
      • Write-back policy
      • Per core:
        • Read: 32 B/cycle (@ ring clock)
        • Write: 32 B/cycle (@ ring clock)
      • 42 cycles for fastest load-to-use
    • Side Cache:
      • 64 MiB & 128 MiB eDRAM
      • Per package
      • Only on the Iris Pro GPUs
      • Read: 32 B/cycle (@ eDRAM clock)
      • Write: 32 B/cycle (@ eDRAM clock)
    • System DRAM:
      • 2 Channels
      • 8 B/cycle/channel (@ memory clock)
      • 42 cycles + 51 ns latency
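
The set counts, associativities, and line sizes listed above are consistent with the stated capacities, since capacity = sets × ways × line size. The sketch below checks that arithmetic. The helper names are hypothetical, and the 4.0 GHz core clock used to convert the DRAM latency to nanoseconds is an assumed example value, as is the reading of the 42 cycles as core clock cycles.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_size_bytes(sets: int, ways: int, line_bytes: int) -> int:
    """Capacity of a set-associative cache: sets x ways x line size."""
    return sets * ways * line_bytes


l1i_bytes = cache_size_bytes(sets=64, ways=8, line_bytes=64)
l1d_bytes = cache_size_bytes(sets=64, ways=8, line_bytes=64)
l2_bytes = cache_size_bytes(sets=1024, ways=4, line_bytes=64)

# The uOP cache is sized in uOPs rather than bytes: 32 sets x 8 ways x 6 uOPs.
uop_cache_entries = 32 * 8 * 6

# Sets implied by one 2 MiB, 16-way L3 portion (derived, not listed above).
l3_sets_per_core = (2 * MIB) // (16 * 64)


def dram_latency_ns(core_clock_ghz: float) -> float:
    """'42 cycles + 51 ns' expressed in ns, assuming core clock cycles."""
    return 42 / core_clock_ghz + 51


print(l1i_bytes // KIB, "KiB L1I")
print(l1d_bytes // KIB, "KiB L1D")
print(l2_bytes // KIB, "KiB L2")
print(uop_cache_entries, "uOPs")
print(l3_sets_per_core, "L3 sets per 2 MiB portion")
print(dram_latency_ns(4.0), "ns DRAM latency at an assumed 4.0 GHz")
```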

As on Coffee Lake, the TLB consists of a dedicated L1 TLB for the instruction cache (ITLB) and another for the data cache (DTLB). Additionally, there is a unified L2 TLB (STLB).

  • TLBs:
    • ITLB
      • 4 KiB page translations:
        • 128 entries; 8-way set associative
        • dynamic partitioning
      • 2 MiB / 4 MiB page translations:
        • 8 entries per thread; fully associative
        • Duplicated for each thread
    • DTLB
      • 4 KiB page translations:
        • 64 entries; 4-way set associative
        • fixed partition
      • 2 MiB / 4 MiB page translations:
        • 32 entries; 4-way set associative
        • fixed partition
      • 1 GiB page translations:
        • 4 entries; 4-way set associative
        • fixed partition
    • STLB
      • 4 KiB + 2 MiB page translations:
        • 1536 entries; 12-way set associative
        • fixed partition
      • 1 GiB page translations:
        • 16 entries; 4-way set associative
        • fixed partition


  • Note: The STLB is incorrectly reported as "6-way" by CPUID leaf 2 (EAX=02H); it is 12-way set associative, as listed above. Coffee Lake erratum CFL084 recommends that software simply ignore that value.

Overview

This section is empty.

Configurability

This section is empty.

Graphics

This section is empty.

Die

This section is empty.

All Comet Lake Chips

This section is empty.

See also

  • AMD Zen 2