From WikiChip
Revision as of 20:37, 27 June 2020

Advanced Matrix Extension (AMX) is an x86 extension that introduces an accelerator framework for operating on matrices.

Overview

The Advanced Matrix Extension (AMX) is an x86 extension that introduces a new programming framework for working with matrices. AMX introduces two new components: a 2-dimensional register file whose registers are called 'tiles', and a set of accelerators that operate on those tiles. Each tile represents a sub-array of a larger 2-dimensional memory image. AMX instructions are synchronous in the instruction stream, and tile loads and stores are coherent with the host's memory accesses. AMX instructions may be freely interleaved with traditional x86 code and may run in parallel with other extensions (e.g., AVX-512); tile loads, tile stores, and accelerator commands are sent to the accelerator for execution.
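To make the idea of a tile as a sub-array of a larger memory image concrete, here is a minimal software model in C. It is only a sketch: the names `tile_model` and `tile_model_load` are hypothetical, the 16-row by 64-byte bounds are taken from palette 1 as described on this page, and the code emulates the data movement on the CPU rather than issuing any AMX instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of one tile: at most 16 rows of 64 bytes. */
typedef struct {
    uint8_t  rows;           /* configured number of rows (<= 16)   */
    uint16_t bytes_per_row;  /* configured bytes per row (<= 64)    */
    uint8_t  data[16][64];   /* tile storage, unused bytes stay zero */
} tile_model;

/*
 * Copy a rows x bytes_per_row sub-array out of a larger row-major image.
 * `base` points at the first byte of the sub-array and `stride` is the
 * distance in bytes between consecutive rows of the full image.
 */
static void tile_model_load(tile_model *t, const uint8_t *base, size_t stride)
{
    memset(t->data, 0, sizeof t->data);
    for (uint8_t r = 0; r < t->rows; r++)
        memcpy(t->data[r], base + (size_t)r * stride, t->bytes_per_row);
}
```

The stride parameter is what lets a small tile address a window inside a much wider matrix: consecutive tile rows come from memory locations one full image row apart.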

Palettes

Determining the kind of operations available on specific hardware can be done by enumerating a palette of options.
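As a rough illustration of what enumerating a palette might look like in software, the sketch below decodes the raw register values returned by a query of CPUID leaf 1DH for one palette. The field layout (which bits of EAX, EBX, and ECX hold which value) is an assumption based on Intel's programming reference and should be checked against it; the type `amx_palette_info` and the function `amx_decode_palette` are hypothetical names. The function takes the register values as arguments instead of executing CPUID, so it can be compiled and tried on any machine.

```c
#include <stdint.h>

/* Hypothetical container for the parameters of one palette. */
typedef struct {
    uint16_t total_tile_bytes; /* size of the whole tile register file */
    uint16_t bytes_per_tile;   /* size of a single tile                */
    uint16_t bytes_per_row;    /* maximum bytes in one tile row        */
    uint16_t max_names;        /* number of tile registers             */
    uint16_t max_rows;         /* maximum rows in one tile             */
} amx_palette_info;

/*
 * Decode the registers returned for one palette sub-leaf.
 * Assumed layout: EAX[15:0] total_tile_bytes, EAX[31:16] bytes_per_tile,
 * EBX[15:0] bytes_per_row, EBX[31:16] max_names, ECX[15:0] max_rows.
 */
static amx_palette_info amx_decode_palette(uint32_t eax, uint32_t ebx,
                                           uint32_t ecx)
{
    amx_palette_info p;
    p.total_tile_bytes = (uint16_t)(eax & 0xFFFFu);
    p.bytes_per_tile   = (uint16_t)(eax >> 16);
    p.bytes_per_row    = (uint16_t)(ebx & 0xFFFFu);
    p.max_names        = (uint16_t)(ebx >> 16);
    p.max_rows         = (uint16_t)(ecx & 0xFFFFu);
    return p;
}
```

Under the assumed layout, register values of `0x04002000`, `0x00080040`, and `0x00000010` would decode to an 8 KiB register file of eight 1 KiB tiles with 16 rows of 64 bytes each.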

Currently, two palettes are defined:

  • Palette 0 - the initialized state
  • Palette 1 - an 8-tile register file in which each tile is 16 rows × 64 bytes (1 KiB), for a total register file size of 8 KiB.
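The palette 1 sizes follow directly from the tile dimensions. The short C sketch below spells out that arithmetic; the constant and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Palette 1 parameters as listed above. */
enum {
    PALETTE1_TILES         = 8,
    PALETTE1_ROWS          = 16,
    PALETTE1_BYTES_PER_ROW = 64
};

/* Size in bytes of one tile with the given dimensions. */
static size_t tile_bytes(size_t rows, size_t bytes_per_row)
{
    return rows * bytes_per_row;
}

/* Size in bytes of a register file made of `tiles` identical tiles. */
static size_t register_file_bytes(size_t tiles, size_t rows,
                                  size_t bytes_per_row)
{
    return tiles * tile_bytes(rows, bytes_per_row);
}
```

With the palette 1 values, `tile_bytes(16, 64)` gives 1024 bytes (1 KiB) and `register_file_bytes(8, 16, 64)` gives 8192 bytes (8 KiB). A 64-byte row holds, for example, sixteen 32-bit elements or sixty-four 8-bit elements.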

A programmer can adjust the effective size of the register file by configuring tiles with smaller dimensions to suit their algorithm. Each tile is configured in terms of rows and bytes_per_row, which are stored as metadata for the accelerator to operate on. The selected palette and the per-tile dimensions are held in a tile control register (TILECFG), while the parameters supported by each palette are enumerated through the palette_table in CPUID leaf 1DH. TILECFG is programmed using the LDTILECFG instruction.
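A tile configuration can be pictured as a small block of memory that LDTILECFG reads. The sketch below builds such a block in C. The 64-byte layout shown (palette byte, start row, reserved bytes, then per-tile bytes_per_row and rows arrays) is an assumption based on Intel's programming reference and should be verified there; `amx_tilecfg`, `amx_tilecfg_init`, and `amx_tilecfg_set` are hypothetical names. The code only fills in the structure and never executes an AMX instruction, so it runs on hardware without AMX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed 64-byte memory image read by LDTILECFG. */
typedef struct {
    uint8_t  palette_id;   /* 0 = initialized state, 1 = palette 1 */
    uint8_t  start_row;    /* restart position for interrupted ops */
    uint8_t  reserved[14]; /* must be zero                         */
    uint16_t colsb[16];    /* bytes_per_row for each tile          */
    uint8_t  rows[16];     /* rows for each tile                   */
} amx_tilecfg;

_Static_assert(sizeof(amx_tilecfg) == 64, "config image must be 64 bytes");

/* Start from an all-zero image that selects palette 1. */
static void amx_tilecfg_init(amx_tilecfg *cfg)
{
    memset(cfg, 0, sizeof *cfg);
    cfg->palette_id = 1;
}

/* Set one tile's shape; returns -1 if it exceeds the palette 1 limits. */
static int amx_tilecfg_set(amx_tilecfg *cfg, unsigned tile,
                           unsigned rows, unsigned bytes_per_row)
{
    if (tile >= 8 || rows == 0 || rows > 16 ||
        bytes_per_row == 0 || bytes_per_row > 64)
        return -1;
    cfg->rows[tile]  = (uint8_t)rows;
    cfg->colsb[tile] = (uint16_t)bytes_per_row;
    return 0;
}

/* On AMX-capable hardware the finished image would then be handed to
 * LDTILECFG, for example through the _tile_loadconfig() intrinsic from
 * <immintrin.h>. That step is deliberately left out of this sketch. */
```

A typical use would call `amx_tilecfg_init(&cfg)` once and then `amx_tilecfg_set(&cfg, 0, 16, 64)` for a full-size tile 0, or pass smaller values such as 4 rows of 32 bytes when the algorithm works on smaller blocks.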

Accelerators

AMX supports a set of accelerators that can operate on tiles. Currently, just one accelerator is defined.

Tile matrix multiply unit (TMUL)

This section is empty.

Microarchitecture support

Instructions | Introduction
AMX          | Sapphire Rapids (server)

Intrinsic functions

Bibliography

  • Intel Architecture Instruction Set Extensions and Future Features Programming Reference, Revision 40. (Ref #319433-040)