From WikiChip
Difference between revisions of "samsung/microarchitectures/m5"

(Architecture)
m (Reverted edits by 177.101.59.145 (talk) to last revision by 165.225.76.211)
 
(5 intermediate revisions by 4 users not shown)
Line 2: Line 2:
 
{{microarchitecture
 
{{microarchitecture
 
|atype=CPU
 
|atype=CPU
|name=Mongoose 5
+
|name= Lion M5
 
|designer=Samsung
 
|designer=Samsung
 
|manufacturer=Samsung
 
|manufacturer=Samsung
 
|introduction=2020
 
|introduction=2020
|process=8 nm
+
|process=7 nm
 +
|cores=2
 +
|type=Superscalar
 +
|type 2=Superpipeline
 +
|oooe=Yes
 +
|speculative=Yes
 +
|renaming=Yes
 +
|stages=16
 +
|decode=6-way
 +
|isa=ARMv8.2
 +
|l1i=64 KiB
 +
|l1i per=core
 +
|l1i desc=4-way set associative
 +
|l1d=64 KiB
 +
|l1d per=core
 +
|l1d desc=8-way set associative
 +
|l2=512 KiB
 +
|l2 per=core
 +
|l2 desc=8-way set associative
 +
|l3=2 MiB
 +
|l3 per=cluster
 +
|l3 desc=16-way set associative
 
|predecessor=M4
 
|predecessor=M4
 
|predecessor link=samsung/microarchitectures/m4
 
|predecessor link=samsung/microarchitectures/m4
|successor=cancelled
 
 
}}
 
}}
'''Exynos Mongoose 5''' ('''M5''') is the successor to the {{\\|Mongoose 4}}, a [[7 nm]] [[ARM]] microarchitecture designed by [[Samsung]] for their consumer electronics.
+
'''Exynos M5''' ('''Lion''') is a [[7 nm]] [[ARM]] microarchitecture designed by [[Samsung]] for its consumer electronics, and the successor to the {{\\|Mongoose 4}}.
 
 
 
 
{{future information}}
 
  
 
== Process Technology ==
 
== Process Technology ==
The M5 is planned to be fabricated on Samsung's [[7 nm process]] (7LPP).
+
The M5 is fabricated on Samsung's [[7 nm process]] (7LPP).
  
 
== Compiler support ==
 
== Compiler support ==
Line 32: Line 49:
  
 
=== Key changes from {{\\|M4}} ===
 
=== Key changes from {{\\|M4}} ===
{{empty section}}
+
* Front end
 +
** Larger [[instruction queue]] (60 entries, up from 48)
 +
** Improved mispredict penalty (15 cycles, down from 16)
 +
* Back end
 +
** LSU execution units reorganized
 +
*** Two new 32b integer ALU pipes
 +
** Floating-point execution units reorganized
 +
*** Three new dedicated {{arm|NEON}} [[dot product]] EUs
 +
{{expand list}}
  
 
=== Block Diagram ===
 
=== Block Diagram ===
 
==== Individual Core ====
 
==== Individual Core ====
  
[[File:mongoose 5 block diagram.svg|900px]]
+
[[File:mongoose 5 block diagram.svg|950px]]
 +
 
 +
=== Memory Hierarchy ===
 +
* Cache
 +
** L1I Cache
 +
*** 64 KiB, 4-way set associative
 +
**** 128 B line size
 +
**** per core
 +
*** Parity-protected
 +
** L1D Cache
 +
*** 64 KiB, 8-way set associative
 +
**** 64 B line size
 +
**** per core
 +
*** 4 cycles for fastest load-to-use
 +
*** 32 B/cycle load bandwidth
 +
*** 16 B/cycle store bandwidth
 +
** L2 Cache
 +
*** 512 KiB, 8-way set associative
 +
*** Inclusive of L1
 +
*** 12 cycles latency
 +
*** 32 B/cycle bandwidth
 +
** L3 Cache
 +
*** 2 MiB, 16-way set associative
 +
**** 1 MiB slice/core
 +
*** Exclusive of L2
 +
*** ~37-cycle typical (NUCA)
 +
** BIU
 +
*** 80 outstanding transactions
 +
 
 +
The M5 TLB consists of a dedicated L1 TLB for the instruction cache (ITLB) and another for the data cache (DTLB). Additionally, there is a unified L2 TLB (STLB).
 +
 
 +
* TLBs
 +
** ITLB
 +
*** 512-entry
 +
** DTLB
 +
*** 32-entry
 +
*** 512-entry Mid-level DTLB
 +
** STLB
 +
*** 4,096-entry
 +
*** Per core
 +
 
 +
* BPU
 +
** 4K-entry main BTB
 +
** 128-entry µBTB
 +
** 64-entry return stack
 +
** 16K-entry L2 BTB
 +
 
 +
 
 +
== Bibliography ==
 +
* LLVM: lib/Target/AArch64/AArch64SchedExynosM5.td

Latest revision as of 21:10, 27 July 2021

Lion M5 µarch
General Info
Arch Type: CPU
Designer: Samsung
Manufacturer: Samsung
Introduction: 2020
Process: 7 nm
Core Configs: 2
Pipeline
Type: Superscalar, Superpipeline
OoOE: Yes
Speculative: Yes
Reg Renaming: Yes
Stages: 16
Decode: 6-way
Instructions
ISA: ARMv8.2
Cache
L1I Cache: 64 KiB/core, 4-way set associative
L1D Cache: 64 KiB/core, 8-way set associative
L2 Cache: 512 KiB/core, 8-way set associative
L3 Cache: 2 MiB/cluster, 16-way set associative
Succession

Exynos M5 (Lion) is a 7 nm ARM microarchitecture designed by Samsung for its consumer electronics, and the successor to the Mongoose 4.

Process Technology

The M5 is fabricated on Samsung's 7 nm process (7LPP).

Compiler support

Compiler  Arch-Specific     Arch-Favorable
GCC       -mcpu=exynos-m5   -mtune=exynos-m5
LLVM      -mcpu=exynos-m5   -mtune=exynos-m5
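
As a rough illustration of how the flags in the table might be used, the following Python sketch assembles a clang command line. The target triple, the -O2 level, and the file names kernel.c and kernel.o are assumptions made for the example, not values taken from this page; only the -mcpu and -mtune spellings come from the table.

```python
import shlex


def clang_command(source, output, tune_only=False):
    """Return an argv list that compiles `source` for the Exynos M5.

    tune_only=False uses -mcpu (arch-specific code generation);
    tune_only=True uses -mtune (scheduling tuned for M5 only).
    """
    cpu_flag = "-mtune=exynos-m5" if tune_only else "-mcpu=exynos-m5"
    return [
        "clang",
        "--target=aarch64-linux-gnu",  # assumed target triple
        cpu_flag,
        "-O2",                         # assumed optimization level
        "-c", source,
        "-o", output,
    ]


# Hypothetical file names, used only for illustration.
cmd = clang_command("kernel.c", "kernel.o")
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run on a machine that has a suitable clang cross toolchain installed.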

Architecture

Key changes from M4

  • Front end
    • Larger instruction queue (60 entries, up from 48)
    • Improved mispredict penalty (15 cycles, down from 16)
  • Back end
    • LSU execution units reorganized
      • Two new 32b integer ALU pipes
    • Floating-point execution units reorganized
      • Three new dedicated NEON dot product EUs

This list is incomplete; you can help by expanding it.
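
To put the one-cycle mispredict improvement in perspective, the sketch below estimates the CPI contribution of branch mispredictions with a simple first-order model (branch fraction × mispredict rate × penalty). The 20% branch fraction and 5% mispredict rate are illustrative assumptions, not measured M4 or M5 figures; only the 16-cycle and 15-cycle penalties come from the list above.

```python
def mispredict_cpi(branch_fraction, mispredict_rate, penalty_cycles):
    """First-order CPI cost of mispredicted branches."""
    return branch_fraction * mispredict_rate * penalty_cycles


BRANCH_FRACTION = 0.20   # assumption: 1 in 5 instructions is a branch
MISPREDICT_RATE = 0.05   # assumption: 5% of branches are mispredicted

m4_cost = mispredict_cpi(BRANCH_FRACTION, MISPREDICT_RATE, 16)  # M4 penalty
m5_cost = mispredict_cpi(BRANCH_FRACTION, MISPREDICT_RATE, 15)  # M5 penalty

print(f"M4: {m4_cost:.3f} CPI, M5: {m5_cost:.3f} CPI, "
      f"saved: {m4_cost - m5_cost:.3f} CPI")
```

Under these assumed rates the saving is small in absolute terms; it scales linearly with how branch-heavy and how hard to predict the workload is.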

Block Diagram

Individual Core

[Figure: Mongoose 5 block diagram]

Memory Hierarchy

  • Cache
    • L1I Cache
      • 64 KiB, 4-way set associative
        • 128 B line size
        • per core
      • Parity-protected
    • L1D Cache
      • 64 KiB, 8-way set associative
        • 64 B line size
        • per core
      • 4 cycles for fastest load-to-use
      • 32 B/cycle load bandwidth
      • 16 B/cycle store bandwidth
    • L2 Cache
      • 512 KiB, 8-way set associative
      • Inclusive of L1
      • 12 cycles latency
      • 32 B/cycle bandwidth
    • L3 Cache
      • 2 MiB, 16-way set associative
        • 1 MiB slice/core
      • Exclusive of L2
      • ~37-cycle typical (NUCA)
    • BIU
      • 80 outstanding transactions
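
The cache parameters above determine the number of sets at each level (size divided by ways times line size). The sketch below works this out in Python. The L1I and L1D line sizes come from the list; the 64 B line size used for L2 and L3 is an assumption, since the page does not state it.

```python
KIB = 1024
MIB = 1024 * KIB


def num_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size is not a multiple of ways * line size")
    return sets


# (total size, associativity, line size in bytes)
caches = {
    "L1I": (64 * KIB, 4, 128),
    "L1D": (64 * KIB, 8, 64),
    "L2": (512 * KIB, 8, 64),   # line size assumed, not stated on the page
    "L3": (2 * MIB, 16, 64),    # line size assumed, not stated on the page
}

geometry = {name: num_sets(*params) for name, params in caches.items()}

for name, sets in geometry.items():
    print(f"{name}: {sets} sets")
```

With these inputs both L1 caches come out at 128 sets, even though they differ in associativity and line size.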

The M5 TLB consists of a dedicated L1 TLB for the instruction cache (ITLB) and another for the data cache (DTLB). Additionally, there is a unified L2 TLB (STLB).

  • TLBs
    • ITLB
      • 512-entry
    • DTLB
      • 32-entry
      • 512-entry Mid-level DTLB
    • STLB
      • 4,096-entry
      • Per core
  • BPU
    • 4K-entry main BTB
    • 128-entry µBTB
    • 64-entry return stack
    • 16K-entry L2 BTB
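
The TLB entry counts listed above can be turned into an approximate "reach", meaning the amount of memory each TLB can map without a miss. The sketch below assumes a 4 KiB page size and one page per entry; the page does not say which page sizes each TLB supports, so treat the results as a lower-bound illustration only.

```python
PAGE_BYTES = 4 * 1024  # assumption: 4 KiB pages, one page per TLB entry

# Entry counts taken from the TLB list above.
tlb_entries = {
    "ITLB": 512,
    "DTLB": 32,
    "Mid-level DTLB": 512,
    "STLB": 4096,
}

# Reach in KiB = entries * page size / 1024.
reach_kib = {name: entries * PAGE_BYTES // 1024
             for name, entries in tlb_entries.items()}

for name, kib in reach_kib.items():
    print(f"{name}: {kib} KiB")
```

Under this assumption the 4,096-entry STLB covers 16 MiB, which is larger than the 2 MiB L3 cache it sits alongside.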


Bibliography

  • LLVM: lib/Target/AArch64/AArch64SchedExynosM5.td