Ethos is a family of synthesizable neural processor IPs designed by Arm for IoT and edge applications.
Overview
First introduced in late 2019, the Ethos family is Arm's series of NPUs developed as part of Project Trillium and aimed at various markets. All Ethos NPUs share the MLP microarchitecture, which is designed to scale through configurations that vary the SRAM size and the number of compute engines.
Members
N-Series
The Ethos-N series was introduced in October 2019. These are fully independent mainstream mobile and IoT NPUs that can be integrated into any SoC, just like the Cortex family. These NPUs are designed to scale from 1 to 4 TOPS and from 250 mW up to 1.5 W. Multiple instances of these IPs may be combined using Arm's CCN-500 or CMN-600 interconnects to scale to higher performance.
NPU | Introduction | Quads | CEs | SRAM per bank | SRAM total | MACs | OPs/clk | Int8 compute | Int16 compute |
---|---|---|---|---|---|---|---|---|---|
Ethos-N37 | Oct 23, 2019 | 1 | 4 | 128 KiB | 512 KiB | 512 MACs | 1024 OPs/clk | 1.024 TOPS | 256 GOPS |
Ethos-N57 | Oct 23, 2019 | 2 | 8 | 64 KiB | 512 KiB | 1024 MACs | 2048 OPs/clk | 2.048 TOPS | 512 GOPS |
Ethos-N77 | Oct 23, 2019 | 4 | 16 | 64 KiB or 256 KiB | 1 MiB or 4 MiB | 2048 MACs | 4096 OPs/clk | 4.096 TOPS | 1.024 TOPS |
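The columns of the N-series table are related by simple arithmetic, which the following minimal sketch reproduces. It assumes two operations per MAC, a 1 GHz clock (implied by 1024 OPs/clk corresponding to 1.024 TOPS), and an Int16 rate of one quarter of the Int8 rate (read off the listed figures); none of these are stated explicitly in the source. The class name, constants, and the "large SRAM" label are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

OPS_PER_MAC = 2          # one multiply plus one accumulate
ASSUMED_CLOCK_GHZ = 1.0  # assumption: implied by 1024 OPs/clk -> 1.024 TOPS
INT16_DIVISOR = 4        # assumption: the Int16 column is one quarter of Int8


@dataclass(frozen=True)
class EthosNConfig:
    """One row of the N-series table (hypothetical helper type)."""
    name: str
    compute_engines: int
    bank_kib: int
    macs: int

    @property
    def total_sram_kib(self) -> int:
        # Total SRAM is one bank per compute engine.
        return self.compute_engines * self.bank_kib

    @property
    def ops_per_clk(self) -> int:
        return self.macs * OPS_PER_MAC

    def int8_gops(self, clock_ghz: float = ASSUMED_CLOCK_GHZ) -> float:
        # OPs/clk times clock in GHz gives giga-operations per second.
        return self.ops_per_clk * clock_ghz

    def int16_gops(self, clock_ghz: float = ASSUMED_CLOCK_GHZ) -> float:
        return self.int8_gops(clock_ghz) / INT16_DIVISOR


CONFIGS = [
    EthosNConfig("Ethos-N37", 4, 128, 512),
    EthosNConfig("Ethos-N57", 8, 64, 1024),
    EthosNConfig("Ethos-N77", 16, 64, 2048),
    EthosNConfig("Ethos-N77 (large SRAM)", 16, 256, 2048),
]

for cfg in CONFIGS:
    print(
        f"{cfg.name}: {cfg.total_sram_kib} KiB SRAM, "
        f"{cfg.ops_per_clk} OPs/clk, "
        f"{cfg.int8_gops():.0f} GOPS Int8, "
        f"{cfg.int16_gops():.0f} GOPS Int16"
    )
```

These are peak figures only; sustained throughput depends on the network, memory bandwidth, and the actual clock of a given implementation.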
U-Series
The Ethos-U series was introduced in early 2020 and targets deeply embedded AI applications. The 'microNPUs' in this series are not complete NPUs like the Ethos-N series; instead, they feature a much slimmer design. The Ethos-U series is designed to work closely with a companion Cortex-M processor such as the Cortex-M55 (the M55 is preferred due to its Helium extension support, but other cores such as the M7, M4, and M33 should also work). Conceptually, the U series can be thought of as a single compute engine (CE) design in which the PLE is removed and the extra processing is handled by the companion Cortex-M core. Due to power and area constraints, the dedicated SRAM banks are also removed; the design instead relies on the shared SoC SRAM (or, ideally, the Cortex-M cache) for the weights and activations.
NPU | Introduction | SRAM | MACs | OPs/clk | Performance (Int8) |
---|---|---|---|---|---|
Ethos-U55 | February 10, 2020 | Shared with Cortex | 32-256 | 64-512 OPs/clk | 25.6-204.8 GOPS (@ 100-400 MHz, typical on 50/40 nm); 64-512 GOPS (@ 1 GHz, typical on 16/7/5 nm) |
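The Ethos-U55 performance ranges follow from the MAC count and the clock frequency. The sketch below is an illustration only: it assumes two operations per MAC per cycle, and the function name `peak_gops` is hypothetical.

```python
OPS_PER_MAC = 2  # assumption: one multiply plus one accumulate per MAC per cycle


def peak_gops(macs: int, clock_mhz: float) -> float:
    """Peak Int8 throughput in GOPS for a given MAC count and clock."""
    return macs * OPS_PER_MAC * clock_mhz / 1000.0


# Smallest (32 MACs) and largest (256 MACs) configurations listed in the table.
for clock_mhz in (400, 1000):
    low = peak_gops(32, clock_mhz)
    high = peak_gops(256, clock_mhz)
    print(f"{clock_mhz} MHz: {low:.1f} to {high:.1f} GOPS")
```

Under these assumptions, both ends of the listed 25.6-204.8 GOPS range correspond to a 400 MHz clock, with the spread coming from the 32 to 256 MAC configurations rather than from the frequency.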
See also
- Intel Habana HL Series