From WikiChip
Difference between revisions of "nvidia/microarchitectures/carmel"
< nvidia

(Created page with "{{nvidia title|Carmel|arch}} {{microarchitecture}}")
 
(Performance claims: single core result, not rate)
 
(11 intermediate revisions by one other user not shown)
Line 1: Line 1:
 
{{nvidia title|Carmel|arch}}
 
{{nvidia title|Carmel|arch}}
{{microarchitecture}}
+
{{microarchitecture
 +
|atype=CPU
 +
|name=Carmel
 +
|designer=Nvidia
 +
|manufacturer=TSMC
 +
|introduction=January 7, 2018
 +
|process=12 nm
 +
|cores=8
 +
|type=Superscalar
 +
|isa=ARMv8
 +
|feature=RAS
 +
|l2=2 MiB
 +
|l2 per=cluster
 +
|l3=4 MiB
 +
|l3 per=complex
 +
|predecessor=Denver 2
 +
|predecessor link=nvidia/microarchitectures/denver 2
 +
}}
 +
'''Carmel''' is a the successor to {{\\|Denver 2}}, an [[ARM]] microarchitecture for [[Nvidia]]'s {{nvidia|Tegra}} series of [[SoCs]].
 +
 
 +
== Process Technology ==
 +
Carmel is integrated into chips fabricated on [[TSMC]] [[12 nm process]] (12FFN)
 +
 
 +
== Architecture ==
 +
Nvidia disclosed very few details regarding Carmel. Carmel is a 10-wide superscalar with each core supporting dual execution mode.
 +
 
 +
=== Key changes from {{\\|Denver 2}} ===
 +
* [[12 nm]] (12FFN)
 +
* ARMv8.2
 +
** ARM RAS standard support
 +
* Wider dispatch (10, up from 7)
 +
 
 +
=== Memory Hierarchy ===
 +
* Cache
 +
** Parity & ECC
 +
** L1
 +
** L2
 +
*** 2 MiB
 +
**** Shared per duplex
 +
** L3
 +
*** 4 MiB
 +
**** Shared by entire cluster
 +
**** Exclusive
 +
 
 +
=== Block Diagram ===
 +
==== CPU Complex ====
 +
[[File:nvidia carmel complex diagram.svg|600px]]
 +
 
 +
== Overview ==
 +
Carmel is a CPU microarchitecture designed by Nvidia for their SoCs. The design consists of an 8-core cluster made of 4 core duplexes. The entire complex has [[cache coherency]] as well as an I/O coherent memory subsystem which is designed for communication with the various other accelerators on their SoCs such as the vision accelerator, deep learning accelerator, multimedia engine, and the GPU.
 +
 
 +
The cores consume 500-1,500 mW per core.
 +
 
 +
=== Performance claims ===
 +
{| class="wikitable"
 +
|-
 +
! SPECint 2000 Rate !! SPECint 2006
 +
|-
 +
| 2700 || 21
 +
|}
 +
 
 +
== Die ==
 +
=== CPU Complex ===
 +
* 8 cores
 +
** 4 duplexes
 +
** shared L3
 +
* ~62.25 mm² die size area
 +
 
 +
:[[File:nvidia carmel complex.png|class=wikichip_ogimage|600px]]
 +
 
 +
 
 +
:[[File:nvidia carmel complex (annotated).png|600px]]
 +
 
 +
=== CPU Duplex ===
 +
* 2 cores
 +
* ~11.4 mm² die size area
 +
 
 +
:[[File:nvidia carmel duplex.png|400px]]
 +
 
 +
 
 +
:[[File:nvidia carmel duplex (annotated).png|400px]]
 +
 
 +
=== Core ===
 +
* ~5.75 mm² die size area
 +
 
 +
:[[File:nvidia carmel core.png|200px]]
 +
 
 +
== Bibliography ==
 +
* IEEE Hot Chips 30 Symposium (HCS) 2018.

Latest revision as of 10:27, 30 April 2019

Edit Values
Carmel µarch
General Info
Arch TypeCPU
DesignerNvidia
ManufacturerTSMC
IntroductionJanuary 7, 2018
Process12 nm
Core Configs8
Pipeline
TypeSuperscalar
Instructions
ISAARMv8
Cache
L2 Cache2 MiB/cluster
L3 Cache4 MiB/complex
Succession

Carmel is a the successor to Denver 2, an ARM microarchitecture for Nvidia's Tegra series of SoCs.

Process Technology[edit]

Carmel is integrated into chips fabricated on TSMC 12 nm process (12FFN)

Architecture[edit]

Nvidia disclosed very few details regarding Carmel. Carmel is a 10-wide superscalar with each core supporting dual execution mode.

Key changes from Denver 2[edit]

  • 12 nm (12FFN)
  • ARMv8.2
    • ARM RAS standard support
  • Wider dispatch (10, up from 7)

Memory Hierarchy[edit]

  • Cache
    • Parity & ECC
    • L1
    • L2
      • 2 MiB
        • Shared per duplex
    • L3
      • 4 MiB
        • Shared by entire cluster
        • Exclusive

Block Diagram[edit]

CPU Complex[edit]

nvidia carmel complex diagram.svg

Overview[edit]

Carmel is a CPU microarchitecture designed by Nvidia for their SoCs. The design consists of an 8-core cluster made of 4 core duplexes. The entire complex has cache coherency as well as an I/O coherent memory subsystem which is designed for communication with the various other accelerators on their SoCs such as the vision accelerator, deep learning accelerator, multimedia engine, and the GPU.

The cores consume 500-1,500 mW per core.

Performance claims[edit]

SPECint 2000 Rate SPECint 2006
2700 21

Die[edit]

CPU Complex[edit]

  • 8 cores
    • 4 duplexes
    • shared L3
  • ~62.25 mm² die size area
nvidia carmel complex.png


nvidia carmel complex (annotated).png

CPU Duplex[edit]

  • 2 cores
  • ~11.4 mm² die size area
nvidia carmel duplex.png


nvidia carmel duplex (annotated).png

Core[edit]

  • ~5.75 mm² die size area
nvidia carmel core.png

Bibliography[edit]

  • IEEE Hot Chips 30 Symposium (HCS) 2018.
codenameCarmel +
core count8 +
designerNvidia +
first launchedJanuary 7, 2018 +
full page namenvidia/microarchitectures/carmel +
instance ofmicroarchitecture +
instruction set architectureARMv8 +
manufacturerTSMC +
microarchitecture typeCPU +
nameCarmel +
pipeline stages10 +
process12 nm (0.012 μm, 1.2e-5 mm) +