From WikiChip
Neoverse N1 - Microarchitectures - ARM
Revision as of 18:05, 18 July 2021 by 177.101.59.145 (talk)

Neoverse N1 µarch
General Info
Arch Type: CPU
Designer: ARM Holdings
Manufacturer: TSMC
Introduction: February 20, 2019
Process: 7 nm
Core Configs: 4, 8, 16, 32, 64, 96, 128
Pipeline
Type: Superscalar, Superpipeline
OoOE: Yes
Speculative: Yes
Reg Renaming: Yes
Stages: 11
Decode: 4-way
Instructions
ISA: ARMv8.2
Cache
L1I Cache: 64 KiB/core, 4-way set associative
L1D Cache: 64 KiB/core, 4-way set associative
L2 Cache: 512 KiB-1 MiB/core, 8-way set associative
L3 Cache: 2-4 MiB/core duplex, 16-way set associative

Neoverse N1 (codename Ares) is a high-performance ARM microarchitecture designed by ARM Holdings for the server market. This microarchitecture is designed as a synthesizable IP core and is sold to other semiconductor companies to be implemented in their own chips.

History

Arm's server roadmap.

The Neoverse N1, formerly Ares, is the first Arm design to specifically target the infrastructure market. It succeeds the Cosmos platform, which used the same cores as Arm's client platform. The N1 was first announced by Drew Henry, Arm's SVP and GM of the Infrastructure Business Unit, in his TechCon 2018 keynote. Ares was officially unveiled on February 20, 2019.

Release Dates

Ares was officially disclosed on February 20, 2019.

Process Technology

Ares is designed specifically for the 7 nm process and exploits its power and area benefits.

Compiler Support

Compiler  Arch-Specific      Arch-Favorable       Arch-Target
GCC       -march=armv8.2-a   -mtune=neoverse-n1   -mcpu=neoverse-n1
LLVM      -march=armv8.2-a   -mtune=neoverse-n1   -mcpu=neoverse-n1
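The three columns differ in what they change: on AArch64, -march selects the instruction set level only, -mtune adjusts scheduling without changing the instruction set, and -mcpu sets both. As a minimal sketch, the helper below assembles a compiler command line from one of the three flag sets. The function name, the mode labels, the -O2 optimization level, and the file names are illustrative assumptions, not part of any toolchain.

```python
# Flag spellings as accepted by GCC and Clang when targeting AArch64.
N1_FLAGS = {
    "arch-specific": "-march=armv8.2-a",     # ISA level only
    "arch-favorable": "-mtune=neoverse-n1",  # scheduling only
    "arch-target": "-mcpu=neoverse-n1",      # ISA level and scheduling
}


def compile_command(compiler, source, mode="arch-target", output="a.out"):
    """Return an argv list that compiles `source` with one N1 flag set.

    Hypothetical helper: the -O2 level and the default output name are
    arbitrary choices made for this example.
    """
    if mode not in N1_FLAGS:
        raise ValueError(f"unknown mode: {mode}")
    return [compiler, "-O2", N1_FLAGS[mode], source, "-o", output]


if __name__ == "__main__":
    # "app.c" is a placeholder source file name.
    for mode in N1_FLAGS:
        print(" ".join(compile_command("gcc", "app.c", mode=mode)))
```

The resulting list can be passed to subprocess.run on a machine that has an AArch64 toolchain installed.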

Architecture

The Neoverse N1 core is almost identical to the Cortex-A76 but features a number of enhancements for infrastructure workloads.

Block Diagram

Typical SoC

neoverse n1 soc block diagram.svg


The Neoverse N1 is also expected to be integrated along with Neoverse E1 high-efficiency cores and possibly other custom IP blocks.


neoverse e1 n1 soc example.svg

Individual Core

neoverse n1 block diagram.svg

Memory Hierarchy

Each Neoverse N1 core has private L1I, L1D, and L2 caches.

  • Cache
    • L1I Cache
      • 64 KiB, 4-way set associative
      • 64-byte cache lines
      • SECDED ECC
      • Write-back
    • L1D Cache
      • 64 KiB, 4-way set associative
      • 64-byte cache lines
      • 4-cycle fastest load-to-use latency
      • SECDED ECC
      • Write-back
    • L2 Cache
      • 512 KiB or 1 MiB (2 banks)
      • 8-way set associative
      • 9-11 cycle latency
        • 9-cycle fastest load-to-use latency
      • ECC protection per 64 bits
      • Modified Exclusive Shared Invalid (MESI) coherency
      • Strictly inclusive of the L1 data cache & non-inclusive of the L1 instruction cache
      • Write-back
    • System-level cache (SLC)
      • 1 bank per core duplex
      • 2 MiB to 4 MiB, 16-way set associative
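The sizes and associativities above fix the number of sets in each private cache, and with it the split of an address into offset and index bits. The sketch below works this out. It assumes a 64-byte line for the L2 as well (the list states the line size only for the L1 caches) and ignores the two-bank split of the L2; the function name is illustrative.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_geometry(size_bytes, ways, line_bytes=64):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes power-of-two sizes, so each bit count is an exact log2.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


# Sizes and associativities from the list above; the 64-byte L2 line
# is an assumption, not a figure given in the list.
CACHES = {
    "L1I": (64 * KIB, 4),
    "L1D": (64 * KIB, 4),
    "L2 (512 KiB)": (512 * KIB, 8),
    "L2 (1 MiB)": (1 * MIB, 8),
}

if __name__ == "__main__":
    for name, (size, ways) in CACHES.items():
        sets, offset_bits, index_bits = cache_geometry(size, ways)
        print(f"{name}: {sets} sets, {offset_bits} offset bits, "
              f"{index_bits} index bits")
```

Under these assumptions, each L1 cache has 256 sets, and the L2 has 1024 or 2048 sets depending on the configured size.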

The Neoverse N1 has a dedicated L1 TLB for instructions (ITLB) and another for data (DTLB). Both are backed by a unified L2 TLB (STLB).

  • TLBs
    • ITLB
      • 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 32 MiB page sizes
      • 48-entry fully associative
    • DTLB
      • 48-entry fully associative
      • 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 512 MiB page sizes
    • STLB
      • 1280-entry 5-way set associative
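A useful figure to derive from these entry counts is TLB reach: the amount of address space a TLB can map without a miss when every entry holds a page of the same size. The sketch below computes it for a few cases. The function names are illustrative, and applying the 4 KiB and 2 MiB page sizes to the STLB is an assumption, since the list above gives page sizes only for the L1 TLBs.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_bytes):
    """Bytes mapped when every entry holds one page of `page_bytes`."""
    return entries * page_bytes


def tlb_sets(entries, ways):
    """Number of sets in a set-associative TLB."""
    return entries // ways


if __name__ == "__main__":
    # L1 TLBs: 48 entries, fully associative.
    print(tlb_reach(48, 4 * KIB) // KIB, "KiB")    # 192 KiB
    print(tlb_reach(48, 2 * MIB) // MIB, "MiB")    # 96 MiB
    # STLB: 1280 entries, 5-way (page sizes assumed, see above).
    print(tlb_reach(1280, 4 * KIB) // MIB, "MiB")  # 5 MiB
    print(tlb_reach(1280, 2 * MIB) // MIB, "MiB")  # 2560 MiB
    print(tlb_sets(1280, 5), "sets")               # 256 sets
```

With 4 KiB pages, the STLB reach of 5 MiB exceeds the largest private L2 configuration (1 MiB).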

Overview

Neoverse N1 Typical SoC

Formerly known as Ares, the Neoverse N1 is the first ground-up Arm microarchitecture designed for infrastructure, targeting a wide range of markets from the edge to hyperscale data centers. Departing from Arm's low-power mobile cores, the N1 targets high-performance server workloads at higher TDPs and higher compute power. Compared to the prior Cosmos platform, the Neoverse N1 is said to deliver a significant uplift in single-thread performance.

The Neoverse N1 is designed to let Arm partners rapidly develop high-performance server products. The N1 features an 11-stage out-of-order core with private L1 and L2 caches. The core is intended to use Arm's Coherent Mesh Network 600 (CMN-600) interconnect to scale from as few as four cores to as many as 128 cores, and from a single DDR channel up to eight channels, depending on the workload being addressed. The base design is extended by a framework for multiprocessing and chiplet support, which companies can use to improve yield and manufacturability in large SoC designs. The N1 is also designed to work seamlessly with the Neoverse E1, which was introduced at the same time as the N1 but is optimized for high-throughput multithreaded workloads.
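To give a feel for how on-chip cache capacity grows across this range of core counts, the sketch below totals the private L2 and the system-level cache using the per-core and per-duplex sizes listed under Memory Hierarchy. It assumes that a core duplex is a pair of cores, which this page does not state, and the function and constant names are illustrative.

```python
CORES_PER_DUPLEX = 2  # assumption: a "core duplex" is a pair of cores


def soc_cache_totals(cores, l2_mib=1.0, slc_mib_per_duplex=2):
    """Return (total private L2 in MiB, total SLC in MiB).

    l2_mib is 0.5 or 1.0 per core; slc_mib_per_duplex is 2 to 4,
    following the Memory Hierarchy list.
    """
    duplexes = cores // CORES_PER_DUPLEX
    return cores * l2_mib, duplexes * slc_mib_per_duplex


if __name__ == "__main__":
    # Core counts taken from the configurations listed for the N1.
    for cores in (4, 8, 16, 32, 64, 96, 128):
        l2_total, slc_total = soc_cache_totals(cores)
        print(f"{cores:3d} cores: {l2_total:6.1f} MiB L2, "
              f"{slc_total:4d} MiB SLC")
```

These totals are upper-level estimates only; an actual SoC may populate a smaller system-level cache than the per-duplex maximum implies.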

Core

The Neoverse N1 features an 11-stage accordion integer pipeline.


neoverse n1 pipeline.svg

Die

N1 core

  • 7 nm process
  • 1 Core + L2
  • 1.2 mm² die size (1C + 512 KiB L2)
  • 1.4 mm² die size (1C + 1 MiB L2)
  • 1 W @ 2.6 GHz (0.75 V), 1.8 W @ 3.1 GHz (1.0 V)
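The per-core figures above can be scaled to a rough core-only budget for a multi-core design. The sketch below multiplies area and power by the core count and also reports watts per GHz at each operating point. It covers cores and their private L2 only: the mesh, system-level cache, memory controllers, and I/O are excluded, so the results are a lower bound rather than an SoC estimate. All names are illustrative.

```python
# Per-core figures from the list above (core plus private L2).
CORE_AREA_MM2 = {"512 KiB": 1.2, "1 MiB": 1.4}
# (frequency in GHz, voltage in V, power in W)
OPERATING_POINTS = [(2.6, 0.75, 1.0), (3.1, 1.0, 1.8)]


def core_budget(cores, l2="1 MiB"):
    """Return (total area in mm², [(GHz, total W), ...]) for cores only."""
    area = cores * CORE_AREA_MM2[l2]
    power = [(ghz, cores * watts) for ghz, _volts, watts in OPERATING_POINTS]
    return area, power


def watts_per_ghz():
    """Per-core power divided by frequency at each operating point."""
    return [(ghz, watts / ghz) for ghz, _volts, watts in OPERATING_POINTS]


if __name__ == "__main__":
    area, power = core_budget(64)
    print(f"64 cores: {area:.1f} mm² of core area")
    for ghz, watts in power:
        print(f"  {watts:.1f} W at {ghz} GHz")
    for ghz, ratio in watts_per_ghz():
        print(f"  {ratio:.2f} W/GHz at {ghz} GHz")
```

The watts-per-GHz figure rises at the 3.1 GHz point, reflecting the higher supply voltage needed to reach that frequency.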


neoverse n1 core die plot.png

All Neoverse N1 Processors

List of Neoverse N1-based Processors
Model     Family    Launched         Process  Arch         Cores  Frequency
ALC12B00  Graviton  3 December 2019  7 nm     Neoverse N1  64     2.5 GHz
Count: 1

Bibliography

  • Drew Henry, TechCon 2018 keynote.
  • Drew Henry, direct communication
  • Most of the technical details were obtained directly from Arm
