From WikiChip
Difference between revisions of "arm holdings/microarchitectures/cortex-a510"

Revision as of 23:43, 21 August 2021

Cortex-A510 µarch
General Info
Arch Type: CPU
Designer: ARM Holdings
Manufacturer: TSMC, Samsung, GlobalFoundries, SMIC
Introduction: May 25, 2021
Process: 7 nm, 6 nm, 5 nm
Core Configs: 1, 2
Pipeline
Type: In-order
OoOE: No
Speculative: Yes
Reg Renaming: No
Decode: 3-way
Instructions
ISA: ARMv9.0
Extensions: FPU, NEON, SVE, SVE2, TrustZone
Cache
L1I Cache: 32-64 KiB/core, 4-way set associative
L1D Cache: 32-64 KiB/core, 4-way set associative
L2 Cache: 0-512 KiB/cluster, 8-way set associative
Succession

Cortex-A510 is an ultra-high efficiency microarchitecture designed by ARM Holdings as a successor to the Cortex-A55. The Cortex-A510, which implements the ARMv9.0 ISA, is typically found in smartphones and other embedded devices. A510 cores are often combined with higher-performance cores (e.g. based on Cortex-A710) in a DynamIQ big.LITTLE configuration to achieve a better balance of energy efficiency and performance.

Note that this microarchitecture is designed as a synthesizable IP core and is sold to other semiconductor companies to be implemented in their own chips.

Process Technology

The Cortex-A510 was primarily designed to take advantage of TSMC's 7 nm, 6 nm, and 5 nm processes, as well as Samsung's 7 nm and 5 nm processes.

Architecture

The Cortex-A510 is a brand new ground-up CPU design. It borrows advanced processor components from Arm's high-performance cores, such as the branch predictors and prefetchers, to extract high performance from a traditional in-order core design. The Cortex-A510 is also designed to integrate seamlessly with higher-performance cores through Arm's DynamIQ big.LITTLE technology.

Key changes from Cortex-A55

  • Brand new ground-up design
    • Higher performance (Arm claims: +35% IPC (SPECint 2006) / +50% IPC (SPECfp 2006))
    • Lower power (Arm claims: -20% energy @ iso-performance / +10% performance @ iso-power)
    • Core Complex with a merged core architecture
      • Two independent cores
      • One shared vector unit
        • Configurable 64b or 128b pipe
    • Front-End
      • Wider fetch (128b/cycle, up from 64b)
      • Wider decoder (3-way, up from 2-way)
      • New branch predictors
      • New prefetchers
    • Back-End
      • In-order
    • Memory Subsystem
      • Larger L1 (32-64 KiB)
      • Larger L2 (0-512 KiB, up from 0-256 KiB)
        • 2x bandwidth L2->L3
      • Wider loads (128b/cycle, up from 64b/cycle)
      • 2x loads (2 lds/cycle, up from 1/cycle)
    • New ISA Support
      • ARMv9.0 ISA
      • SVE, SVE2 support
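
The front-end and load figures in the list above can be turned into rough peak numbers. The following is a minimal arithmetic sketch, not a statement of measured throughput. It assumes that each of the two loads per cycle can be a full 128 bits wide, and it relies on A64 instructions being a fixed 32 bits wide. The function name `peak_load_bytes_per_cycle` is hypothetical.

```python
def peak_load_bytes_per_cycle(loads_per_cycle: int, bits_per_load: int) -> int:
    """Peak L1D load bandwidth in bytes per cycle (idealized upper bound)."""
    return loads_per_cycle * bits_per_load // 8


# Figures taken from the list above.
a55_load = peak_load_bytes_per_cycle(loads_per_cycle=1, bits_per_load=64)
# Assumption: both A510 loads in a cycle can be 128 bits wide.
a510_load = peak_load_bytes_per_cycle(loads_per_cycle=2, bits_per_load=128)

# A64 instructions are 32 bits each, so a 128-bit fetch covers 4 of them.
A64_INSTRUCTION_BITS = 32
a510_fetch_slots = 128 // A64_INSTRUCTION_BITS
a55_fetch_slots = 64 // A64_INSTRUCTION_BITS

print(f"A55 peak load:  {a55_load} B/cycle")
print(f"A510 peak load: {a510_load} B/cycle ({a510_load // a55_load}x)")
print(f"Fetch slots per cycle: A55 {a55_fetch_slots}, A510 {a510_fetch_slots}")
```

Under these assumptions the fetch stage can supply up to four A64 instructions per cycle to a 3-way decoder, which suggests the fetch width leaves some slack over the decode width.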

Block Diagram

Core Complex

Individual Core

Memory Hierarchy

Each Cortex-A510 core has a private L1I and L1D cache, and the L2 cache is shared across the core complex.

  • Cache
    • L1I Cache
      • Private to core
      • 32 KiB OR 64 KiB, 4-way set associative
      • 64-byte cache lines
      • Virtually-indexed, physically-tagged (VIPT) behaving as physically-indexed, physically-tagged (PIPT)
      • Single Error Detect (SED) parity cache protection
      • Pseudo-random cache replacement policy
    • L1D Cache
      • Private to core
      • 32 KiB OR 64 KiB, 4-way set associative
      • Virtually-Indexed, Physically-Tagged (VIPT) behaving as Physically-Indexed, Physically-Tagged (PIPT)
      • Error Correcting Code (ECC) cache protection
      • 64-byte cache lines
      • Pseudo-random cache replacement policy
    • L2 Cache
      • Private to complex
      • 128 KiB OR 192 KiB OR 256 KiB OR 384 KiB OR 512 KiB, 8-way set associative
      • 64-byte cache lines
      • Can be configured as 1-2 slices
        • Slice includes: data RAMs, L2 tags, L2 replacement RAM, and L1 duplicate tag RAMs
        • Slice can be configured as single/dual partitions for up to two concurrent accesses to different L2 ways
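
The cache parameters listed above determine the number of sets and the size of each way. The sketch below works through that arithmetic for every listed configuration. It is illustrative only: the helper name `cache_geometry` is hypothetical, and the 4 KiB page size used in the comparison is an assumption that does not come from the list.

```python
def cache_geometry(size_kib: int, ways: int, line_bytes: int = 64) -> tuple[int, int]:
    """Return (number of sets, bytes per way) for a set-associative cache."""
    size_bytes = size_kib * 1024
    sets = size_bytes // (ways * line_bytes)
    way_bytes = size_bytes // ways
    return sets, way_bytes


PAGE_BYTES = 4096  # assumption: 4 KiB translation granule

# (name, supported sizes in KiB, associativity), as listed above.
CONFIGS = [
    ("L1I", (32, 64), 4),
    ("L1D", (32, 64), 4),
    ("L2", (128, 192, 256, 384, 512), 8),
]

for name, sizes, ways in CONFIGS:
    for size_kib in sizes:
        sets, way_bytes = cache_geometry(size_kib, ways)
        print(f"{name} {size_kib:>3} KiB, {ways}-way: "
              f"{sets:>4} sets, {way_bytes // 1024} KiB per way")

# For a virtually indexed cache, a way larger than a page means some index
# bits come from above the page offset.
l1_sets, l1_way_bytes = cache_geometry(32, 4)
print("L1 way exceeds assumed page size:", l1_way_bytes > PAGE_BYTES)
```

With the assumed 4 KiB pages, even the 32 KiB L1 configuration has 8 KiB ways, so part of the index would come from translated address bits. That is one plausible reading of why the L1 caches are described as VIPT "behaving as" PIPT: the hardware would need to hide any resulting aliases from software.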


The Cortex-A510 features an instruction TLB (ITLB) and data TLB (DTLB) which are private to each core and an L2 TLB that is private to the core complex.

  • TLBs
    • ITLB
      • 16 entries
      • fully associative
      • TLB hits return the PA to the instruction cache
    • DTLB
      • 16 entries
      • fully associative
      • TLB hits return the PA to the data cache
    • L2 TLB
      • 8-way set associative
      • Shared by both cores in the complex
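
To illustrate what a 16-entry fully associative TLB does on a hit, here is a toy software model. It is a sketch under stated assumptions, not a description of the actual hardware: the class name `FullyAssociativeTLB`, the 4 KiB page size, and the FIFO replacement policy are all assumptions, since the list above does not specify them.

```python
from collections import OrderedDict

PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class FullyAssociativeTLB:
    """Toy TLB: any entry can hold any virtual page number (VPN)."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self.map: OrderedDict[int, int] = OrderedDict()  # VPN -> PFN

    def lookup(self, va: int):
        """Return the physical address on a hit, or None on a miss."""
        pfn = self.map.get(va >> PAGE_SHIFT)
        if pfn is None:
            return None
        return (pfn << PAGE_SHIFT) | (va & PAGE_MASK)

    def fill(self, va: int, pa: int) -> None:
        """Install a translation, evicting the oldest entry when full.

        FIFO eviction is an assumption made for this sketch.
        """
        vpn = va >> PAGE_SHIFT
        if vpn not in self.map and len(self.map) >= self.entries:
            self.map.popitem(last=False)
        self.map[vpn] = pa >> PAGE_SHIFT


tlb = FullyAssociativeTLB()
tlb.fill(0x7000_1234, 0x4_2000)
print(hex(tlb.lookup(0x7000_1ABC)))  # same page, different offset: 0x42abc
print(tlb.lookup(0x8000_0000))       # unmapped page: None
```

With the assumed 4 KiB pages, 16 entries would cover 64 KiB of address space per L1 TLB, so misses would fall through to the shared L2 TLB.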

Overview

Core Complex

Core

All Cortex-A510 Processors

Bibliography

  • Arm Tech Day, 2021