From WikiChip
Difference between revisions of "samsung/microarchitectures/m3"

Revision as of 09:18, 5 February 2018

Mongoose 3 µarch
General Info
Arch Type: CPU
Designer: Samsung
Manufacturer: Samsung
Introduction: 2018
Process: 10 nm
Core Configs: 4
Pipeline
OoOE: Yes
Speculative: Yes
Reg Renaming: Yes
Decode: 6-way
Instructions
ISA: ARMv8

Mongoose 3 (M3) is an ARM microarchitecture designed by Samsung for its consumer electronics, serving as the successor to the Mongoose 2.

Process Technology

The M3 is fabricated on Samsung's second-generation 10 nm process, 10LPP (Low Power Plus).

Compiler support

Compiler    Arch-Specific            Arch-Favorable
GCC         -march=armv8-a+crypto    -mtune=exynos-m3
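
As a rough illustration of how the two flags in the table combine on a command line, the sketch below assembles a GCC invocation in Python. The helper name `m3_gcc_command`, the file names, and the `-O2` level are illustrative assumptions rather than anything stated on this page, and whether a given GCC release accepts `exynos-m3` as a tuning target is not verified here.

```python
import shlex

# Flags copied from the table above.
ARCH_SPECIFIC = "-march=armv8-a+crypto"  # selects the instruction set the compiler may use
ARCH_FAVORABLE = "-mtune=exynos-m3"      # tuning/scheduling hint only


def m3_gcc_command(source, output, compiler="gcc"):
    """Return the argument list for a hypothetical M3-targeted build."""
    # -O2 is an assumed optimization level, not taken from the table.
    return [compiler, "-O2", ARCH_SPECIFIC, ARCH_FAVORABLE, "-o", output, source]


if __name__ == "__main__":
    # Print the command as a single shell-safe string.
    print(shlex.join(m3_gcc_command("main.c", "main")))
```

Keeping the two flags separate mirrors the table: the arch-specific flag changes which instructions may be emitted, while the arch-favorable flag is meant to affect only how the code is tuned.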

Architecture

Preliminary Data! Information presented in this article deals with future products, data, features, and specifications that have yet to be finalized, announced, or released. Information may be incomplete and may change by final release.

Key changes from Mongoose 1/M2

  • 10 nm 10LPP process (from 1st-gen 10 nm, 10LPE)
  • Core
    • Front-end
      • Larger instruction queue (40 entries, up from 24)
      • 6-way decode (from 4)
      • µOP fusion
        • Can fuse address generation and memory operations
        • Can fuse literal generation operations
    • Back-end
      • Larger ReOrder buffer (228 entries, from 96 entries)
      • New fastpath logical shift of up to 3 places
      • Larger dispatch window (12 µOP/cycle, from 9)
      • Larger Integer physical register file
      • Larger FP physical register file
      • Integer cluster
        • 9 pipes (from 7)
          • New pipe for a second load unit added
          • New pipe for a second ALU with 3-operand support and MUL/DIV
      • Floating Point cluster
        • 3 pipes (from 2)
          • Throughput of most FP operations has increased by 50%
          • Additional EUs
            • crypto EU, simple vector EU, vector shuffle/shift/mul, new FP store, new FP conversion
    • Memory subsystem
      • 2x bandwidth (32B (2x16B)/cycle from 16B/cycle)
        • Fast paired 128-bit loads and stores
  • Branch misprediction penalty increased (16 cycles, from 14)
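
To put the figures in the list above side by side, the following sketch computes the relative change for each item that gives both an old and a new value. The dictionary keys and the helper `relative_change` are illustrative names; the numbers are copied from the list and nothing here is measured.

```python
# (previous, M3) values copied from the list above.
CHANGES = {
    "instruction queue entries": (24, 40),
    "decode width": (4, 6),
    "reorder buffer entries": (96, 228),
    "dispatch width (µOPs/cycle)": (9, 12),
    "integer pipes": (7, 9),
    "memory subsystem bandwidth (B/cycle)": (16, 32),
    "branch misprediction penalty (cycles)": (14, 16),
}


def relative_change(old, new):
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


if __name__ == "__main__":
    for name, (old, new) in CHANGES.items():
        print(f"{name}: {old} -> {new} ({relative_change(old, new):+.1%})")
```

By this arithmetic the reorder buffer grows by 137.5% and the decode width by 50%, while the branch misprediction penalty rises by roughly 14%.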


Memory Hierarchy

This section is empty.

Core

This section is empty.

All M3 Processors

List of M3-based Processors
Main processor: Model, Family, Launched, Arch, Cores, Frequency, Turbo
Integrated Graphics: GPU, Frequency
Count: 0


References

  • LLVM: lib/Target/AArch64/AArch64SchedExynosM3.td