From WikiChip
Difference between revisions of "arm holdings/microarchitectures/cortex-a510"
< arm holdings

(Memory Hierarchy)
(Block Diagram)
Line 75: Line 75:
 
=== Block Diagram ===
 
=== Block Diagram ===
 
==== Core Complex ====
 
==== Core Complex ====
==== Individual Core ====
+
:[[File:cortex-a510 block diagram.svg|1000px]]
  
 
=== Memory Hierarchy ===
 
=== Memory Hierarchy ===

Revision as of 00:50, 22 August 2021

Edit Values
Cortex-A510 µarch
General Info
Arch TypeCPU
DesignerARM Holdings
ManufacturerTSMC, Samsung, GlobalFoundries, SMIC
IntroductionMay 25, 2021
Process7 nm, 6 nm, 5 nm
Core Configs1, 2
Pipeline
TypeIn-order
OoOENo
SpeculativeYes
Reg RenamingNo
Decode3-way
Instructions
ISAARMv9.0
ExtensionsFPU, NEON, SVE, SVE2, TrustZone
Cache
L1I Cache32-64 KiB/core
4-way set associative
L1D Cache32-64 KiB/core
4-way set associative
L2 Cache0-512 KiB/cluster
4-way set associative
Succession

Cortex-A510 is an ultra-high efficiency microarchitecture designed by ARM Holdings as a successor to the Cortex-A55. The Cortex-A510, which implements the ARMv9.0 ISA, is typically found in smartphone and other embedded devices. Often A510 cores are combined with higher performance processors (e.g. based on Cortex-A710) in DynamIQ big.LITTLE configuration to achieve better energy/performance.

Note that this microarchitecture is designed as a synthesizable IP core and is sold to other semiconductor companies to be implemented in their own chips.

Process Technology

The Cortex-A510 was primarily designed to take advantage of TSMC's 7 nm, 6 nm, 5 nm as well as Samsung's 7 nm and 5 nm.

Architecture

The Cortex-A510 is a brand new ground-up CPU design. It borrows advanced processor components from Arm's high-performance cores - such as the branch prediction and prefetchers - to extract high performance from a traditional in-order core design. The Cortex-A512 is also designed to seamlessly integrate along with higher-performance cores through Arm's DynamIQ big.LITTLE technology.

Key changes from Cortex-A55

  • Brand new ground-up design
    • Higher performance (Arm claims: +35% IPC (SPECint 2006) / +50% IPC (SPECfp 2006)
    • Lower power (Arm claims: -20% energy @ iso-performance / +10% performance @ iso-power)
    • Core Complex with a merged core architecture
      • Two independent cores
      • One shared vector unit
        • Configurable 64b or 128b pipe
    • Front-End
      • Wider fetch (128b/cycle, up from 64b)
      • Wider decoder (3-way, up from 2-way)
      • New branch predictors
      • New prefetchers
    • Back-End
      • In-order
    • Memory Subsystem
      • Larger L1 (32-64 KiB)
      • Larger L2 (0-512 KiB, up from 0-256 KiB)
        • 2x bandwidth L2->L3
      • Wider loads (128b/cycle, up from 64b/cycle)
      • 2x loads (2 lds/cycle, up from 1/cycle)
    • New ISA Support
      • ARMv9.0 ISA
      • SVE, SVE2 support

Block Diagram

Core Complex

cortex-a510 block diagram.svg

Memory Hierarchy

The Cortex-A510 has a private L1I, L1D, and cluster-wide L2 cache.

  • Cache
    • L1I Cache
      • Private to core
      • 32 KiB OR 64 KiB, 4-way set associative
      • 64-byte cache lines
      • Virtually-indexed, physically-tagged (VIPT) behaving as physically-indexed, physically-tagged (PIPT)
      • Single Error Detect (SED) parity cache protection
      • Pseudo-random cache replacement policy
    • L1D Cache
      • Private to core
      • 32KB or 64KB, 4-way set associative
      • Virtually-Indexed, Physically-Tagged (VIPT) behaving as Physically-Indexed, Physically-Tagged (PIPT)
      • Error Correcting Code (ECC) cache protection
      • 64-byte cache lines
      • Pseudo-random cache replacement policy
    • L2 Cache
      • Private to complex
      • 128 KiB OR 192 KiB OR 256 KiB OR 384 KiB OR 512 KiB, 8-way set associative
      • 64-byte cache lines
      • Can be configured as 1-2 slices
        • Slice includes: data RAMs, L2 tags, L2 replacement RAM, and L1 duplicate tag RAMs
        • Slice can be configured as single/dual partitions for up to two concurrent accesses to different L2 ways


The Cortex-A510 features an instruction TLB (ITLB) and data TLB (DTLB) which are private to each core and an L2 TLB that is private to the core complex.

  • TLBs
    • ITLB
      • 16-entries
      • fully associative
      • TLB hits return the PA to the instruction cache
    • DTLB
      • 16-entries
      • fully associative
      • TLB hits return the PA to the data cache
    • L2 TLB
      • 8-way set associative
      • Shared by both cores in the complex

Overview

Core Complex

Core

All Cortex-A510 Processors

Bibliography

  • Arm Tech Day, 2021
codenameCortex-A510 +
core count1 + and 2 +
designerARM Holdings +
first launchedMay 25, 2021 +
full page namearm holdings/microarchitectures/cortex-a510 +
instance ofmicroarchitecture +
instruction set architectureARMv9.0 +
manufacturerTSMC +, Samsung +, GlobalFoundries + and SMIC +
microarchitecture typeCPU +
nameCortex-A510 +
process7 nm (0.007 μm, 7.0e-6 mm) +, 6 nm (0.006 μm, 6.0e-6 mm) + and 5 nm (0.005 μm, 5.0e-6 mm) +