From WikiChip
Latest revision as of 21:10, 27 July 2021
Lion M5 µarch
Section | Property | Value |
---|---|---|
General Info | Arch Type | CPU |
General Info | Designer | Samsung |
General Info | Manufacturer | Samsung |
General Info | Introduction | 2020 |
General Info | Process | 7 nm |
General Info | Core Configs | 2 |
Pipeline | Type | Superscalar, Superpipeline |
Pipeline | OoOE | Yes |
Pipeline | Speculative | Yes |
Pipeline | Reg Renaming | Yes |
Pipeline | Stages | 16 |
Pipeline | Decode | 6-way |
Instructions | ISA | ARMv8.2 |
Cache | L1I Cache | 64 KiB/core, 4-way set associative |
Cache | L1D Cache | 64 KiB/core, 8-way set associative |
Cache | L2 Cache | 512 KiB/core, 8-way set associative |
Cache | L3 Cache | 2 MiB/cluster, 16-way set associative |
Succession | Predecessor | M4 |
Exynos M5 (Lion) is a 7 nm ARM microarchitecture designed by Samsung for its consumer electronics. It is the successor to the Mongoose 4 (M4).
Process Technology
The M5 is fabricated on Samsung's 7 nm process (7LPP).
Compiler support
Compiler | Arch-Specific | Arch-Favorable |
---|---|---|
GCC | -mcpu=exynos-m5 | -mtune=exynos-m5 |
LLVM | -mcpu=exynos-m5 | -mtune=exynos-m5 |
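As an illustration of how the two columns might be used, the sketch below assembles a compiler command line from the flags in the table. The helper name `compile_command`, the `-O2` level, and the file name are assumptions made for the example; only the `-mcpu=exynos-m5` and `-mtune=exynos-m5` strings come from the table. Broadly, `-mcpu` targets the named core for both instruction selection and tuning, while `-mtune` only adjusts tuning, so the "favorable" form should still produce code that runs on other cores.

```python
# Flags copied from the compiler support table; everything else is hypothetical.
EXYNOS_M5_FLAGS = {
    "arch-specific": "-mcpu=exynos-m5",    # target the M5 itself
    "arch-favorable": "-mtune=exynos-m5",  # tune for the M5 only
}


def compile_command(compiler: str, source: str, mode: str = "arch-favorable") -> list[str]:
    """Build an argv list for compiling one source file with an M5 flag."""
    if mode not in EXYNOS_M5_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    return [compiler, "-O2", EXYNOS_M5_FLAGS[mode], "-c", source]


if __name__ == "__main__":
    print(" ".join(compile_command("clang", "kernel.c")))
    print(" ".join(compile_command("gcc", "kernel.c", mode="arch-specific")))
```

Whether a given compiler release actually accepts these values depends on its version, so it is worth checking the toolchain before relying on them.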
Architecture
Key changes from M4
- Front end
  - Larger instruction queue (60 entries, up from 48)
  - Improved mispredict penalty (15 cycles, down from 16)
- Back end
  - LSU execution units reorganized
    - Two new 32-bit integer ALU pipes
  - Floating-point execution units reorganized
    - Three new dedicated NEON dot product EUs
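To put the one-cycle mispredict improvement in perspective, the following rough sketch estimates the cycles lost to branch mispredictions for a hypothetical workload. The branch fraction (20%) and mispredict rate (5%) are assumptions chosen only for illustration; only the 16-cycle and 15-cycle penalties come from the list above.

```python
def mispredict_stall_cycles(instructions: int,
                            branch_fraction: float,
                            mispredict_rate: float,
                            penalty_cycles: int) -> float:
    """Estimate cycles lost to mispredicted branches.

    lost cycles = instructions * branch fraction * mispredict rate * penalty
    """
    return instructions * branch_fraction * mispredict_rate * penalty_cycles


if __name__ == "__main__":
    # Assumed workload: 1M instructions, 20% branches, 5% of branches mispredicted.
    m4 = mispredict_stall_cycles(1_000_000, 0.20, 0.05, penalty_cycles=16)
    m5 = mispredict_stall_cycles(1_000_000, 0.20, 0.05, penalty_cycles=15)
    print(f"M4: {m4:.0f} cycles, M5: {m5:.0f} cycles, saved: {m4 - m5:.0f}")
```

Under this simple model the saving scales linearly with the number of mispredicted branches, so the shorter penalty matters most on branch-heavy, hard-to-predict code.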
Block Diagram
Individual Core
[Figure: M5 individual core block diagram]
Memory Hierarchy
- Cache
  - L1I Cache
    - 64 KiB, 4-way set associative
      - 128 B line size
      - per core
    - Parity-protected
  - L1D Cache
    - 64 KiB, 8-way set associative
      - 64 B line size
      - per core
    - 4 cycles for fastest load-to-use
    - 32 B/cycle load bandwidth
    - 16 B/cycle store bandwidth
  - L2 Cache
    - 512 KiB, 8-way set associative
    - Inclusive of L1
    - 12 cycles latency
    - 32 B/cycle bandwidth
  - L3 Cache
    - 2 MiB, 16-way set associative
      - 1 MiB slice/core
    - Exclusive of L2
    - ~37-cycle typical (NUCA)
  - BIU
    - 80 outstanding transactions
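The L1 geometry listed above fixes the number of sets and the address bits used for the line offset and set index. The sketch below works this out for the L1I and L1D caches only, since line sizes for L2 and L3 are not given. The function names are illustrative, and the calculation assumes a conventional set-associative layout with power-of-two sizes.

```python
KIB = 1024


def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets = capacity / (associativity * line size)."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("capacity is not a multiple of ways * line size")
    return sets


def log2_exact(value: int) -> int:
    """Return log2(value) for a power of two, else raise."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a power of two")
    return value.bit_length() - 1


if __name__ == "__main__":
    for name, size, ways, line in [("L1I", 64 * KIB, 4, 128),
                                   ("L1D", 64 * KIB, 8, 64)]:
        sets = cache_sets(size, ways, line)
        print(f"{name}: {sets} sets, "
              f"{log2_exact(line)} offset bits, {log2_exact(sets)} index bits")
```

Both caches come out at 128 sets: the L1I uses fewer ways but lines twice as long, so the two differences cancel.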
The M5 TLB consists of a dedicated L1 TLB for the instruction cache (ITLB) and another for the data cache (DTLB). Additionally, there is a unified L2 TLB (STLB).
- TLBs
  - ITLB
    - 512-entry
  - DTLB
    - 32-entry
    - 512-entry Mid-level DTLB
  - STLB
    - 4,096-entry
    - Per core
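One way to read the entry counts above is as TLB reach, the amount of memory each level can map without a miss. The sketch below assumes 4 KiB pages and one page per entry; both are assumptions made for the example, as the page sizes each TLB supports are not stated here.

```python
KIB = 1024
MIB = 1024 * KIB

PAGE_BYTES = 4 * KIB  # assumed page size, not taken from the source

# Entry counts from the TLB list above.
TLB_ENTRIES = {
    "ITLB": 512,
    "DTLB": 32,
    "Mid-level DTLB": 512,
    "STLB": 4096,
}


def tlb_reach(entries: int, page_bytes: int = PAGE_BYTES) -> int:
    """Bytes of address space covered when every entry maps one page."""
    return entries * page_bytes


if __name__ == "__main__":
    for name, entries in TLB_ENTRIES.items():
        reach = tlb_reach(entries)
        if reach >= MIB:
            print(f"{name}: {reach // MIB} MiB")
        else:
            print(f"{name}: {reach // KIB} KiB")
```

With larger pages the reach grows proportionally, so these figures are a lower bound under the stated assumption.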
- BPU
  - 4K-entry main BTB
  - 128-entry µBTB
  - 64-entry return stack
  - 16K-entry L2 BTB
Bibliography
- LLVM: lib/Target/AArch64/AArch64SchedExynosM5.td