Latest revision as of 12:46, 18 February 2023
Neoverse N1 µarch | |
General Info | |
Arch Type | CPU |
Designer | ARM Holdings |
Manufacturer | TSMC |
Introduction | February 20, 2019 |
Process | 7 nm |
Core Configs | 4, 8, 16, 32, 64, 96, 128 |
Pipeline | |
Type | Superscalar, Superpipeline |
OoOE | Yes |
Speculative | Yes |
Reg Renaming | Yes |
Stages | 11 |
Decode | 4-way |
Instructions | |
ISA | ARMv8.2 |
Cache | |
L1I Cache | 64 KiB/core 4-way set associative |
L1D Cache | 64 KiB/core 4-way set associative |
L2 Cache | 512 KiB-1 MiB/core 8-way set associative |
L3 Cache | 2-4 MiB/core duplex 16-way set associative |
Neoverse N1 (codename Ares) is a high-performance ARM microarchitecture designed by ARM Holdings for the server market. This microarchitecture is designed as a synthesizable IP core and is sold to other semiconductor companies to be implemented in their own chips.
History
The Neoverse N1, formerly Ares, is the first Arm design to specifically target the infrastructure market, serving as the successor to the Cosmos platform which used the same cores as the client platform. The N1 was first announced by Drew Henry, Arm’s SVP and GM of Infrastructure Business Unit, at his TechCon 2018 keynote. Ares was officially unveiled on February 20, 2019.
Release Dates
Ares was officially disclosed on February 20, 2019.
Process Technology
Ares specifically takes advantage of the power and area advantages of the 7 nm process.
Compiler Support
Compiler | Arch-Specific | Arch-Favorable | Arch-Target |
---|---|---|---|
GCC | -march=armv8.2-a | -mtune=neoverse-n1 | -mcpu=neoverse-n1 |
LLVM | -march=armv8.2-a | -mtune=neoverse-n1 | -mcpu=neoverse-n1 |
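As a rough illustration of how the flags in the table above reach a compiler, the following Python sketch assembles a command line. The helper name `build_compile_command`, the file names, and the `-O2` optimization level are assumptions made for this example; only the three architecture flags come from the table.

```python
# Flags taken from the table above, keyed by the column they belong to.
N1_FLAGS = {
    "arch": "-march=armv8.2-a",    # Arch-Specific: selects the ISA level
    "tune": "-mtune=neoverse-n1",  # Arch-Favorable: tunes code generation for the N1
    "cpu": "-mcpu=neoverse-n1",    # Arch-Target: names the N1 as the target CPU
}


def build_compile_command(compiler, source, output, mode="cpu"):
    """Return an argument list that compiles `source` for Neoverse N1.

    `compiler` is the driver name (for example "gcc" or "clang") and
    `mode` picks one of the flag columns from the table.
    """
    if mode not in N1_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    # -O2 is an assumed optimization level, not something the table specifies.
    return [compiler, "-O2", N1_FLAGS[mode], "-o", output, source]


if __name__ == "__main__":
    # Prints: gcc -O2 -mcpu=neoverse-n1 -o main main.c
    print(" ".join(build_compile_command("gcc", "main.c", "main")))
```

The list form can be handed directly to `subprocess.run` on a machine where the chosen compiler is installed.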
Architecture
The Neoverse N1 core is almost identical to the Cortex-A76 but features a number of enhancements for infrastructure workloads.
- ARMv8.2
- 7 nm process
- Core
  - 11-stage
  - 4-way decode
  - 8-way issue
- System architecture
  - Designed for the Coherent Mesh Network 600 (CMN-600) mesh interconnect
Block Diagram
Typical SoC
The Neoverse N1 is also expected to be integrated along with Neoverse E1 high-efficiency cores and possibly other custom IP blocks.
Individual Core
Memory Hierarchy
Each Neoverse N1 core has private L1I, L1D, and L2 caches.
- Cache
  - L1I Cache
    - 64 KiB, 4-way set associative
    - 64-byte cache lines
    - SECDED ECC
    - Write-back
  - L1D Cache
    - 64 KiB, 4-way set associative
    - 64-byte cache lines
    - 4-cycle fastest load-to-use latency
    - SECDED ECC
    - Write-back
  - L2 Cache
    - 512 KiB or 1 MiB (2 banks)
    - 8-way set associative
    - 9-11 cycle latency
    - 9-cycle fastest load-to-use latency
    - ECC protection per 64 bits
    - Modified Exclusive Shared Invalid (MESI) coherency
    - Strictly inclusive of the L1 data cache & non-inclusive of the L1 instruction cache
    - Write-back
  - System-level cache (SLC)
    - 1 bank per core duplex
    - 2 MiB to 4 MiB, 16-way set associative
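The cache figures listed above determine how many sets each cache has, since sets = size / (ways × line size). The sketch below works this out in Python. The function name `cache_sets` is hypothetical, and the 64-byte line size for the L2 is an assumption (the list only states it for the L1 caches); the two L2 banks are ignored.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_sets(size_bytes, ways, line_bytes=64):
    """Return the number of sets in a set-associative cache."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size is not a multiple of ways * line size")
    return sets


if __name__ == "__main__":
    # L1I and L1D: 64 KiB, 4-way, 64-byte lines.
    print("L1 sets:", cache_sets(64 * KIB, ways=4))            # 256
    # L2: 8-way, line size assumed to be 64 bytes.
    print("L2 (512 KiB) sets:", cache_sets(512 * KIB, ways=8))  # 1024
    print("L2 (1 MiB) sets:", cache_sets(1 * MIB, ways=8))      # 2048
```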
The Neoverse N1 TLB hierarchy consists of a dedicated L1 TLB for instructions (ITLB) and another for data (DTLB). Additionally, there is a unified L2 TLB (STLB).
- TLBs
  - ITLB
    - 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 32 MiB page sizes
    - 48-entry fully associative
  - DTLB
    - 48-entry fully associative
    - 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 512 MiB page sizes
  - STLB
    - 1280-entry 5-way set associative
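To put the TLB sizes above in perspective, a TLB's "reach" is the amount of memory it can map without a miss, which is simply entries × page size. The following Python sketch computes it for a few cases. The helper names `tlb_reach` and `tlb_sets` are hypothetical, and treating every entry as holding a page of the same size (including 4 KiB pages in the STLB) is a simplifying assumption.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_bytes):
    """Bytes of address space covered when every entry maps one page."""
    return entries * page_bytes


def tlb_sets(entries, ways):
    """Number of sets in a set-associative TLB."""
    sets, remainder = divmod(entries, ways)
    if remainder:
        raise ValueError("entries is not a multiple of ways")
    return sets


if __name__ == "__main__":
    # 48-entry L1 TLBs with the smallest and a larger supported page size.
    print(tlb_reach(48, 4 * KIB) // KIB, "KiB")    # 192 KiB
    print(tlb_reach(48, 2 * MIB) // MIB, "MiB")    # 96 MiB
    # 1280-entry, 5-way STLB, assuming 4 KiB pages.
    print(tlb_reach(1280, 4 * KIB) // MIB, "MiB")  # 5 MiB
    print(tlb_sets(1280, 5), "sets")               # 256 sets
```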
Overview
Formerly known as Ares, the Neoverse N1 is the first ground-up Arm microarchitecture design that targets infrastructure, addressing a wide range of markets from the edge to hyperscale data centers. Departing from Arm's low-power mobile cores, the N1 targets high-performance server workloads at higher TDPs and higher compute power. Compared to the prior Cosmos platform, the Neoverse N1 is said to deliver a significant uplift in single-thread performance.
The Neoverse N1 is designed to let Arm partners rapidly develop high-performance server products. The N1 features an 11-stage out-of-order core with private L1 and L2 caches. The core is intended to leverage Arm's Coherent Mesh Network 600 (CMN-600) interconnect to scale from as few as four cores to as many as 128 cores, and from a single DDR channel all the way up to eight channels, depending on the kind of workload being addressed. Extending the base design is a framework for multiprocessing support as well as chiplet support, which can be used by companies looking to improve yield and manufacturability with large SoC designs. The N1 is also designed to work seamlessly with the Neoverse E1, which was introduced at the same time as the N1 but is optimized for high-throughput multithreaded workloads.
Core
The Neoverse N1 features an 11-stage accordion integer pipeline.
Die
N1 core
- 7 nm process
- 1 Core + L2
  - 1.2 mm² die size (1C + 512 KiB L2)
  - 1.4 mm² die size (1C + 1 MiB L2)
- 1 W @ 2.6 GHz (0.75 V), 1.8 W @ 3.1 GHz (1.0 V)
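The per-core figures above can be scaled into a rough area and power budget for a many-core design. The Python sketch below is a back-of-the-envelope estimate only: it counts cores and their L2 caches and ignores the mesh, system-level cache, memory controllers, and I/O. All function names are hypothetical. `scaled_dynamic_power` uses the textbook approximation that dynamic power scales with frequency × voltage², which is not taken from the source.

```python
# Per-core figures from the list above (core + private L2, 7 nm).
CORE_AREA_MM2 = {"512KiB": 1.2, "1MiB": 1.4}


def cluster_area_mm2(cores, l2="512KiB"):
    """Silicon area of `cores` N1 cores with their L2 caches only."""
    return cores * CORE_AREA_MM2[l2]


def cluster_power_w(cores, watts_per_core):
    """Total core power, assuming all cores run at the same operating point."""
    return cores * watts_per_core


def scaled_dynamic_power(p0, f0, v0, f1, v1):
    """Estimate power at (f1, v1) from a known point using P ~ f * V^2."""
    return p0 * (f1 / f0) * (v1 / v0) ** 2


if __name__ == "__main__":
    print(f"{cluster_area_mm2(64):.1f} mm^2")           # 64 cores, 512 KiB L2
    print(f"{cluster_area_mm2(64, '1MiB'):.1f} mm^2")   # 64 cores, 1 MiB L2
    print(f"{cluster_power_w(64, 1.0):.1f} W")          # 64 cores at 2.6 GHz
    # The simple model gives about 2.1 W at 3.1 GHz / 1.0 V,
    # somewhat above the 1.8 W quoted in the list.
    print(f"{scaled_dynamic_power(1.0, 2.6, 0.75, 3.1, 1.0):.2f} W")
```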
All Neoverse N1 Processors
List of Neoverse N1-based Processors
Model | Family | Launched | Process | Arch | Cores | Frequency |
---|---|---|---|---|---|---|
ALC12B00 | Graviton | 3 December 2019 | 7 nm | Neoverse N1 | 64 | 2.5 GHz |
Bibliography
- Drew Henry, TechCon 2018 keynote.
- Drew Henry, direct communication
- Most of the technical details were obtained directly from Arm
Documents
- Neoverse N1 Software Optimization Guide
- Neoverse N1 Technical Reference Manual