Goya µarch
General Info
Arch Type | NPU
Designer | Habana
Manufacturer | TSMC
Introduction | 2018
Process | 16 nm
PE Configs | 8
Contemporary | Gaudi
Goya is a 16-nanometer microarchitecture for inference neural processors designed by Habana Labs.
Process Technology
Goya-based processors are fabricated on TSMC's 16-nanometer process.
Architecture
Block Diagram
Overview
Goya is a microarchitecture designed for the acceleration of inference workloads. Since the target market is the data center, the thermal design power of these chips is relatively high, at around 200 W. The design takes a heterogeneous approach comprising a large General Matrix Multiply (GEMM) engine, Tensor Processor Cores (TPCs), and a large shared memory pool.
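To make the role of the GEMM engine concrete, the following is a minimal, generic C sketch of the general matrix multiply operation (C = A × B with 8-bit inputs and 32-bit accumulation) that such an engine accelerates in hardware. The function name and signature are illustrative only and are not part of Habana's API; on Goya this class of operation is offloaded to the dedicated engine rather than executed as a software loop.

#include <stddef.h>
#include <stdint.h>

/* Reference int8 GEMM with int32 accumulation: C = A (M x K) * B (K x N).
 * This is the kind of dense matrix operation a hardware GEMM engine offloads. */
void gemm_int8(const int8_t *a, const int8_t *b, int32_t *c,
               size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; p++) {
                /* Widen to 32 bits before multiplying to avoid overflow. */
                acc += (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}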
There are eight TPCs. Each TPC incorporates its own local memory but omits caches. Each core is a VLIW DSP design that has been optimized for AI applications, including AI-specific instructions and operations. The TPCs are designed for flexibility and can be programmed in plain C. The TPC supports mixed-precision operations, including 8-bit, 16-bit, and 32-bit SIMD vector operations for both integer and floating-point data. This allows the tolerance for accuracy loss to be controlled on a per-model basis by the programmer. Goya offers both coarse-grained precision control and fine-grained control down to the tensor level.
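Because the TPCs are programmed in plain C and precision can be chosen down to the tensor level, a common inference pattern is quantizing 32-bit floating-point tensors to 8-bit integers. The sketch below is a generic illustration of symmetric per-tensor quantization under that assumption; it is ordinary host-side C, not Habana's TPC intrinsics or SDK.

#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Symmetric per-tensor quantization: derive a single scale from the tensor's
 * maximum magnitude, then map floats into int8. The tighter the scale fits
 * the data, the smaller the accuracy loss -- which is why fine-grained,
 * per-tensor precision control matters for some models. */
float tensor_scale(const float *x, size_t n)
{
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float v = fabsf(x[i]);
        if (v > max_abs)
            max_abs = v;
    }
    return max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
}

void quantize_int8(const float *x, int8_t *q, size_t n, float scale)
{
    for (size_t i = 0; i < n; i++) {
        float r = x[i] / scale;
        /* Clamp to the int8 range before rounding to nearest. */
        if (r > 127.0f)  r = 127.0f;
        if (r < -128.0f) r = -128.0f;
        q[i] = (int8_t)lrintf(r);
    }
}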
Scalability
See also