Difference between revisions of "x86/avx-512"

Revision as of 22:50, 9 July 2017

AVX-512 is the collective name for a number of 512-bit SIMD x86 instruction set extensions. The extensions were formally introduced by Intel in July 2013, with the first general-purpose microprocessors implementing them introduced in July 2017.

Overview

AVX-512 is a set of 512-bit SIMD extensions that allow programs to pack eight double-precision or sixteen single-precision floating-point numbers, or eight 64-bit or sixteen 32-bit integers, within 512-bit vectors. The extensions double the vector width of AVX/AVX2 (256 bits), so each instruction can operate on twice as many elements.
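
The element counts above follow directly from dividing the vector width by the element width. A minimal sketch of that arithmetic is given here; the names `lanes`, `ELEMENT_BITS`, `LANES_512` and `LANES_256` are illustrative only and do not come from any Intel tool or header.

```python
VECTOR_BITS = 512  # width of one AVX-512 vector, as stated in the text

# Element widths in bits for the data types named in the text.
ELEMENT_BITS = {
    "single-precision float": 32,
    "double-precision float": 64,
    "32-bit integer": 32,
    "64-bit integer": 64,
}


def lanes(vector_bits: int, element_bits: int) -> int:
    """Return how many elements of the given width fit in one vector."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_bits // element_bits


# Element counts for a 512-bit vector and, for comparison, a 256-bit
# AVX/AVX2 vector.
LANES_512 = {name: lanes(VECTOR_BITS, bits) for name, bits in ELEMENT_BITS.items()}
LANES_256 = {name: lanes(256, bits) for name, bits in ELEMENT_BITS.items()}

for name, count in LANES_512.items():
    print(f"{name}: {count} per 512-bit vector, {LANES_256[name]} per 256-bit vector")
```

Each 512-bit count is exactly twice the corresponding 256-bit count, which is the sense in which the vector width doubles relative to AVX/AVX2.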

AVX512F - AVX-512 Foundation is the base of the 512-bit SIMD instruction extensions and provides a comprehensive set of features for most HPC and enterprise applications. It is the natural extension of AVX/AVX2, encoded with the EVEX prefix, which builds on the existing VEX prefix. Any processor that implements any portion of the AVX-512 extensions MUST implement AVX512F.

AVX512CD - AVX-512 Conflict Detection Instructions allow more loops to be vectorized, including loops with possible address conflicts.

AVX512DQ - AVX-512 Doubleword and Quadword Instructions add new 32-bit and 64-bit AVX-512 instructions.

AVX512PF - AVX-512 Prefetch Instructions add new prefetch instructions for gather/scatter and PREFETCHWT1.

AVX512ER - AVX-512 Exponential and Reciprocal Instructions (ERI) offer 28-bit precision RCP, RSQRT and EXP transcendentals for various scientific applications.

AVX512VL - AVX-512 Vector Length Instructions add vector length orthogonality, allowing most AVX-512 operations to also operate on XMM (128-bit, SSE) registers and YMM (256-bit, AVX) registers.

AVX512BW - AVX-512 Byte and Word Instructions add support for 8-bit and 16-bit integer operations.

AVX512IFMA - AVX-512 Integer Fused Multiply-Add Instructions add support for fused multiply-add of integers using 52-bit precision.

AVX512VBMI - AVX-512 Vector Bit Manipulation Instructions add further vector byte permutation instructions.

AVX512_4FMAPS - AVX-512 Fused Multiply Accumulation Packed Single precision Instructions add vector instructions for deep learning on single-precision floating-point data.

AVX512_4VNNIW - AVX-512 Vector Neural Network Instructions Word Variable Precision add vector instructions for deep learning on enhanced word variable precision.

AVX512_VPOPCNTDQ - AVX-512 Vector Population Count Doubleword and Quadword Instructions add doubleword and quadword population count instructions.
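
Software normally discovers which of the subsets listed above are present by reading CPUID feature flags. The sketch below decodes such flags from raw register values and checks the rule that any AVX-512 subset implies AVX512F. The bit positions (CPUID leaf 7, sub-leaf 0) are an assumption taken from Intel's CPUID documentation rather than from this article, the helper names `decode_avx512` and `is_consistent` are hypothetical, and the sample register value is invented for illustration. The code does not execute CPUID itself; it only interprets values obtained elsewhere.

```python
# Assumed bit positions in CPUID leaf 7, sub-leaf 0 (from Intel's CPUID
# documentation, not from the article text): subset name -> (register, bit).
AVX512_BITS = {
    "AVX512F": ("ebx", 16),
    "AVX512DQ": ("ebx", 17),
    "AVX512IFMA": ("ebx", 21),
    "AVX512PF": ("ebx", 26),
    "AVX512ER": ("ebx", 27),
    "AVX512CD": ("ebx", 28),
    "AVX512BW": ("ebx", 30),
    "AVX512VL": ("ebx", 31),
    "AVX512VBMI": ("ecx", 1),
    "AVX512_VPOPCNTDQ": ("ecx", 14),
    "AVX512_4VNNIW": ("edx", 2),
    "AVX512_4FMAPS": ("edx", 3),
}


def decode_avx512(ebx: int, ecx: int, edx: int) -> set:
    """Return the names of the AVX-512 subsets whose feature bit is set."""
    regs = {"ebx": ebx, "ecx": ecx, "edx": edx}
    return {
        name
        for name, (reg, bit) in AVX512_BITS.items()
        if (regs[reg] >> bit) & 1
    }


def is_consistent(features: set) -> bool:
    """Any reported AVX-512 subset requires AVX512F to be reported too."""
    return not features or "AVX512F" in features


# Hypothetical register value with the F, DQ, CD, BW and VL bits set.
sample_ebx = (1 << 16) | (1 << 17) | (1 << 28) | (1 << 30) | (1 << 31)
sample = decode_avx512(sample_ebx, 0, 0)
print(sorted(sample), is_consistent(sample))
```

Keeping the decoder a pure function of the three register values makes it usable on machines without AVX-512, for example when inspecting a saved CPUID dump.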

@@ Line 5: / Line 5: @@
 AVX-512 is a set of {{arch|512}} [[SIMD]] extensions that allow programs to pack sixteen [[single-precision]] eight [[double-precision]] [[floating-point]] numbers, or eight 64-bit or sixteen 32-bit integers within 512-bit vectors. The extension provides double the computation capabilities of that of {{x86|AVX}}/{{x86|AV2}}.
-* {{x86|AVX512F|'''AVX-512 Foundation'''}} ('''AVX512F''') is base of the 512-bit SIMD instruction extensions which is a comprehensive list of features for most HPC and enterprise applications. AVX-512 Foundation is the natural extensions to AVX/AVX2 which is extended using the {{x86|EVEX}} prefix which builds on the existing {{x86|VEX}} prefix. Any processor that implements any portion of the AVX-512 extensions MUST implement AVX512F.
+* '''AVX512F''' - {{x86|AVX512F|'''AVX-512 Foundation'''}} is base of the 512-bit SIMD instruction extensions which is a comprehensive list of features for most HPC and enterprise applications. AVX-512 Foundation is the natural extensions to AVX/AVX2 which is extended using the {{x86|EVEX}} prefix which builds on the existing {{x86|VEX}} prefix. Any processor that implements any portion of the AVX-512 extensions MUST implement AVX512F.
-* {{x86|AVX512CD|'''AVX-512 Conflict Detection'''}} ('''AVX512CD''') Instructions offer additional vectorization of loops with possible address conflict.
+* '''AVX512CD''' - {{x86|AVX512CD|'''AVX-512 Conflict Detection'''}} Instructions offer additional vectorization of loops with possible address conflict.
-* {{x86|AVX512DQ|'''AVX-512 Doubleword and Quadword'''}} ('''AVX512DQ''') Instructions add new 32-bit and 64-bit AVX-512 instructions
+* '''AVX512DQ''' - {{x86|AVX512DQ|'''AVX-512 Doubleword and Quadword'''}} Instructions add new 32-bit and 64-bit AVX-512 instructions
-* {{x86|AVX512PF|'''AVX-512 Prefetch'''}} ('''AVX512PF''') Instructions add new prefetch instructions for gather/scatter and PREFETCHWT1 .
+* '''AVX512PF''' - {{x86|AVX512PF|'''AVX-512 Prefetch'''}} Instructions add new prefetch instructions for gather/scatter and PREFETCHWT1 .
-* {{x86|AVX512ER|'''AVX-512 Exponential and Reciprocal'''}} ('''AVX512ER''') Instructions ('''ERI''') offer 28-bit precision RCP, RSQRT and EXP transcendentals for various scientific applications.
+* '''AVX512ER''' - {{x86|AVX512ER|'''AVX-512 Exponential and Reciprocal'''}} Instructions ('''ERI''') offer 28-bit precision RCP, RSQRT and EXP transcendentals for various scientific applications.
-* {{x86|AVX512VL|'''AVX-512 Vector Length'''}} ('''AVX512VL''') Instructions add vector length orthogonality, allowing most AVX-512 operations to also operate on {{x86|XMM}} (128-bit, {{x86|SSE}}) registers and {{x86|YMM}} (256-bit, {{x86|AVX}}) registers
+* '''AVX512VL''' - {{x86|AVX512VL|'''AVX-512 Vector Length'''}} Instructions add vector length orthogonality, allowing most AVX-512 operations to also operate on {{x86|XMM}} (128-bit, {{x86|SSE}}) registers and {{x86|YMM}} (256-bit, {{x86|AVX}}) registers
-* {{x86|AVX512BW|'''AVX-512 Byte and Word'''}} ('''AVX512BW''') Instructions add support for for 8-bit and 16-bit integer operations.
+* '''AVX512BW''' - {{x86|AVX512BW|'''AVX-512 Byte and Word'''}} Instructions add support for for 8-bit and 16-bit integer operations.
-* {{x86|AVX512IFMA|'''AVX-512 Integer Fused Multiply-Add'''}} ('''AVX512IFMA''') Instructions add support for fused multiply add of integers using 52-bit precision.
+* '''AVX512IFMA''' - {{x86|AVX512IFMA|'''AVX-512 Integer Fused Multiply-Add'''}} Instructions add support for fused multiply add of integers using 52-bit precision.
-* {{x86|AVX512VBMI|'''AVX-512 Vector Bit Manipulation'''}} ('''AVX512VBMI''') Instructions add additional vector byte permutation instructions.
+* '''AVX512VBMI''' - {{x86|AVX512VBMI|'''AVX-512 Vector Bit Manipulation'''}} Instructions add additional vector byte permutation instructions.
-* {{x86|AVX512_4FMAPS|'''AVX-512 Fused Multiply Accumulation Packed Single precision'''}} ('''AVX512_4FMAPS''') Instructions add vector instructions for deep learning on floating-point single precision
+* '''AVX512_4FMAPS''' - {{x86|AVX512_4FMAPS|'''AVX-512 Fused Multiply Accumulation Packed Single precision'''}} Instructions add vector instructions for deep learning on floating-point single precision
-* {{x86|AVX512_4VNNIW|'''AVX-512 Vector Neural Network Instructions Word Variable Precision'''}} ('''AVX512_4VNNIW''') Instructions add vector instructions for deep learning on enhanced word variable precision
+* '''AVX512_4VNNIW''' - {{x86|AVX512_4VNNIW|'''AVX-512 Vector Neural Network Instructions Word Variable Precision'''}} Instructions add vector instructions for deep learning on enhanced word variable precision
-* {{x86|AVX512_VPOPCNTDQ|'''AVX-512 VPOPCNTDQ Vector Population Count Double Word and Quad Word'''}} ('''AVX512_VPOPCNTDQ''') Instructions add double and quad word population count instructions.
+* '''AVX512_VPOPCNTDQ''' - {{x86|AVX512_VPOPCNTDQ|'''AVX-512 Vector Population Count Doubleword and Quadword'''}} Instructions add double and quad word population count instructions.
