From WikiChip
Editing x86/amx
Latest revision | Your text | ||
Line 1: | Line 1: | ||
{{x86 title|Advanced Matrix Extension (AMX)}}{{x86 isa main}} | {{x86 title|Advanced Matrix Extension (AMX)}}{{x86 isa main}} | ||
− | '''Advanced Matrix Extension''' ('''AMX''') is an [[x86]] | + | '''Advanced Matrix Extension''' ('''AMX''') is an [[x86]] extension that introduces an accelerator framework for operating on matrices. |
== Overview == | == Overview == | ||
− | + | The Advanced Matrix Extension (AMX) is an [[x86]] extension that introduces a new programming framework for working with matrices. AMX introduces two new components: a 2-dimensional register file whose registers are called 'tiles', and a set of [[accelerators]] that operate on those tiles. Each tile represents a sub-array of a larger 2-dimensional memory image. AMX instructions are synchronous in the instruction stream, and tile loads and stores are coherent with the host's memory accesses. AMX instructions may be freely interleaved with traditional x86 code and may execute in parallel with other extensions (e.g., [[AVX512]]); tile loads, tile stores, and accelerator commands are sent to the accelerator for execution. | |
− | The Advanced Matrix Extension (AMX) is an [[x86]] extension that introduces a new programming framework for working with matrices | ||
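The idea of a tile as a sub-array of a larger 2-dimensional memory image can be illustrated with a small software model. This is only a sketch: the helper names <code>load_tile</code> and <code>store_tile</code>, and the <code>base</code>/<code>stride</code>/<code>colsb</code> parameters, are hypothetical and chosen for illustration; they are not AMX instructions or an official API.

```python
def load_tile(image, base, stride, rows, colsb):
    """Copy a rows x colsb byte sub-array out of a flat memory image.

    image  : bytes-like object holding the whole 2-D image, row-major
    base   : byte offset of the tile's first element (hypothetical parameter)
    stride : distance in bytes between consecutive rows of the image
    """
    tile = []
    for r in range(rows):
        start = base + r * stride
        tile.append(list(image[start:start + colsb]))
    return tile


def store_tile(image, base, stride, tile):
    """Write a tile (list of rows of byte values) back into the image."""
    for r, row in enumerate(tile):
        start = base + r * stride
        image[start:start + len(row)] = bytes(row)


# A 4-row image with 8 bytes per row, filled with 0..31.
image = bytearray(range(32))

# Take a 2 x 4 byte tile starting at row 1, column 2 (offset 1*8 + 2 = 10).
tile = load_tile(image, base=10, stride=8, rows=2, colsb=4)
print(tile)  # [[10, 11, 12, 13], [18, 19, 20, 21]]
```

The stride is what lets a small tile address a window inside a much wider matrix: consecutive tile rows are <code>stride</code> bytes apart in memory, not <code>colsb</code> bytes apart.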
=== Palettes === | === Palettes === | ||
Line 19: | Line 18: | ||
AMX supports a set of accelerators that can operate on tiles. Currently, just one accelerator is defined. | AMX supports a set of accelerators that can operate on tiles. Currently, just one accelerator is defined. | ||
==== Tile matrix multiply unit (TMUL) ==== | ==== Tile matrix multiply unit (TMUL) ==== | ||
− | The '''Tile Matrix Multiply''' ('''TMUL''') Unit is an accelerator as part of AMX comprising a grid of fused multiply-add units capable of operating on tiles. Its existence is defined by the | + | The '''Tile Matrix Multiply''' ('''TMUL''') unit is an AMX accelerator comprising a grid of fused multiply-add units capable of operating on tiles. Its existence is defined by the AMX-INT8 and AMX-BF16 sub-extensions. The TMUL unit supports a number of parameters, including the maximum height (<code>tmul_maxk</code>) and the maximum SIMD dimension (<code>tmul_maxn</code>). Those parameters are read dynamically by the TMUL unit upon execution. |
− | |||
− | The TMUL unit comes with a number of parameters supported including the maximum height (<code>tmul_maxk</code>) and maximum SIMD dimension (<code>tmul_maxn</code>). Those parameters are dynamically read by the TMUL unit upon execution. | ||
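How <code>tmul_maxk</code> and <code>tmul_maxn</code> might constrain the tiles fed to the TMUL unit can be sketched as a simple shape check. Everything here is an assumption made for illustration: the limit values 16 and 64, the helper name <code>tmul_shapes_ok</code>, the 4-byte element grouping, and the exact set of checks are not taken from this page and should be verified against the vendor documentation.

```python
# Assumed example limits; real values are reported by the hardware.
TMUL_MAXK = 16  # assumption: maximum K dimension (rows of the second source)
TMUL_MAXN = 64  # assumption: maximum number of bytes per tile row


def tmul_shapes_ok(a_rows, a_colsb, b_rows, b_colsb, c_rows, c_colsb,
                   maxk=TMUL_MAXK, maxn=TMUL_MAXN):
    """Simplified, hypothetical check that C += A * B is well-formed.

    Row widths (colsb) are in bytes; elements are assumed to be grouped
    into 4-byte units, so A contributes a_colsb // 4 groups per row.
    """
    k = a_colsb // 4
    return (
        a_colsb % 4 == 0 and c_colsb % 4 == 0  # whole 4-byte groups only
        and c_rows == a_rows                   # C has one row per row of A
        and b_rows == k                        # inner dimensions must agree
        and c_colsb == b_colsb                 # C and B rows have equal width
        and k <= maxk                          # height limit (tmul_maxk)
        and b_colsb <= maxn                    # SIMD width limit (tmul_maxn)
    )


print(tmul_shapes_ok(16, 64, 16, 64, 16, 64))  # True
print(tmul_shapes_ok(16, 64, 8, 64, 16, 64))   # False: inner dimension mismatch
```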
== Instructions == | == Instructions == | ||
− | |||
AMX introduces 12 new instructions: | AMX introduces 12 new instructions: | ||
Line 38: | Line 34: | ||
Operation: | Operation: | ||
− | * <code>TDPBF16PS</code> - | + | * <code>TDPBF16PS</code> - Dot product of [[BF16]] tiles. Performs a set of SIMD dot-products on pairs of BF16 elements and accumulates the results into a packed single-precision tile. |
− | * <code> | + | * <code>TDPBSSD</code>/<code>TDPBSUD</code>/<code>TDPBUSD</code>/<code>TDPBUUD</code> - Dot product of [[Int8]] tiles. Performs a set of SIMD dot-products on pairs of bytes and accumulates the results. ''SU'' = Signed/Unsigned, ''US'' = Unsigned/Signed, ''SS'' = Signed/Signed, and ''UU'' = Unsigned/Unsigned pairs. |
− | |||
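The byte dot-product instructions listed above can be approximated with a scalar reference model. This is a minimal sketch under stated assumptions, not the architectural definition: the helper name <code>tdpb_dot</code> is hypothetical, each 32-bit accumulator is assumed to receive the sum of four byte products, the first letter of the ''SS''/''SU''/''US''/''UU'' pair is assumed to apply to the first source tile, and accumulator overflow is not modeled.

```python
def to_signed8(b):
    """Reinterpret a byte value 0..255 as a signed 8-bit integer."""
    return b - 256 if b >= 128 else b


def tdpb_dot(c, a, b, a_signed, b_signed):
    """Hypothetical model of a byte-tile dot product: c += a . b.

    a : M rows of 4*K bytes        (first source tile)
    b : K rows of 4*N bytes        (second source tile)
    c : M rows of N accumulators   (destination tile, updated in place)
    Each accumulator c[m][n] gains the sum of four byte products per k.
    """
    k_groups = len(a[0]) // 4
    n_cols = len(b[0]) // 4
    for m in range(len(a)):
        for k in range(k_groups):
            for n in range(n_cols):
                for i in range(4):
                    x = a[m][4 * k + i]
                    y = b[k][4 * n + i]
                    if a_signed:
                        x = to_signed8(x)
                    if b_signed:
                        y = to_signed8(y)
                    c[m][n] += x * y
    return c


a = [[1, 2, 3, 4]]                          # M = 1, K = 1
b = [[1, 1, 1, 1, 255, 255, 255, 255]]      # K = 1, N = 2

# Unsigned/Unsigned pairing: 255 is treated as 255.
print(tdpb_dot([[0, 0]], a, b, a_signed=False, b_signed=False))  # [[10, 2550]]

# Unsigned/Signed pairing: 255 in the second source is treated as -1.
print(tdpb_dot([[0, 0]], a, b, a_signed=False, b_signed=True))   # [[10, -10]]
```

The same loop covers all four variants because only the interpretation of each source byte changes; the accumulation pattern stays the same.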
=== Feature set === | === Feature set === |