From WikiChip
Galois Field New Instructions (GFNI) - x86
Galois Field New Instructions (GFNI) is an x86 instruction set extension that adds three instructions for Galois field arithmetic on packed bytes.
Overview
- GF2P8AFFINEQB - ...
- GF2P8AFFINEINVQB - ...
- GF2P8MULB - ...
Motivation
Detection
The GFNI feature flag indicates support for the legacy SSE encoding of these instructions, which operates on 128-bit vectors. The VEX-encoded AVX versions, operating on 128- and 256-bit vectors, are supported if the AVX flag is set as well.
The EVEX-encoded AVX-512 variants operating on 512-bit vectors are supported if both the GFNI and AVX512F (Foundation) flags are set. The 128- and 256-bit EVEX-encoded versions additionally require the AVX512VL (Vector Length) flag.
| CPUID input | CPUID output | Instruction set |
|---|---|---|
| EAX=01H | ECX[bit 28] | AVX |
| EAX=07H, ECX=0 | EBX[bit 16] | AVX512F |
| EAX=07H, ECX=0 | EBX[bit 31] | AVX512VL |
| EAX=07H, ECX=0 | ECX[bit 08] | GFNI |
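The flag combinations described above can be sketched in C roughly as follows. This is a minimal illustration, not a definitive implementation: the type `gfni_support` and the functions `gfni_decode` and `gfni_query` are hypothetical names introduced here, and the query assumes the GCC/Clang `<cpuid.h>` helpers `__get_cpuid` and `__get_cpuid_count`. It checks only the CPUID bits listed in the table; a production check would usually also confirm that the operating system has enabled the AVX and AVX-512 register state (OSXSAVE and XGETBV), which this sketch omits.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang CPUID helpers (assumed toolchain) */
#endif

/* Hypothetical result type: one flag per encoding of the GFNI instructions. */
typedef struct {
    bool sse;       /* legacy SSE encoding, 128-bit */
    bool avx;       /* VEX encoding, 128/256-bit */
    bool avx512_vl; /* EVEX encoding, 128/256-bit */
    bool avx512;    /* EVEX encoding, 512-bit */
} gfni_support;

/* Decode raw CPUID register values according to the table above:
 * leaf 1 ECX, leaf 7 (subleaf 0) EBX, leaf 7 (subleaf 0) ECX. */
gfni_support gfni_decode(uint32_t leaf1_ecx, uint32_t leaf7_ebx,
                         uint32_t leaf7_ecx)
{
    bool gfni     = ((leaf7_ecx >> 8) & 1u) != 0;  /* GFNI */
    bool avx      = ((leaf1_ecx >> 28) & 1u) != 0; /* AVX */
    bool avx512f  = ((leaf7_ebx >> 16) & 1u) != 0; /* AVX512F */
    bool avx512vl = ((leaf7_ebx >> 31) & 1u) != 0; /* AVX512VL */

    gfni_support s;
    s.sse       = gfni;
    s.avx       = gfni && avx;
    s.avx512    = gfni && avx512f;
    s.avx512_vl = gfni && avx512f && avx512vl;
    return s;
}

/* Query the running CPU. On non-x86 targets, or when a leaf is not
 * available, the corresponding registers are treated as zero. */
gfni_support gfni_query(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    unsigned int leaf1_ecx = 0, leaf7_ebx = 0, leaf7_ecx = 0;
#if defined(__x86_64__) || defined(__i386__)
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        leaf1_ecx = ecx;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        leaf7_ebx = ebx;
        leaf7_ecx = ecx;
    }
#endif
    (void)eax; (void)edx;
    return gfni_decode(leaf1_ecx, leaf7_ebx, leaf7_ecx);
}
```

Keeping the bit decoding separate from the CPUID query is a design choice for this sketch: `gfni_decode` can be exercised with synthetic register values on any machine, while `gfni_query` reports what the current processor advertises.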
Microarchitecture support
| Designer | Microarchitecture | Year | GFNI SSE | GFNI AVX | GFNI AVX-512 (128/256) | GFNI AVX-512 (512) |
|---|---|---|---|---|---|---|
| AMD | Zen 4 | 2022 | ✔ | ✔ | ✔ | ✔ |
Intrinsic functions
// GF2P8AFFINEQB
__m128i _mm_gf2p8affine_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affine_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affine_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affine_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affine_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affine_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affine_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affine_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affine_epi64_epi8(__mmask64, __m512i, __m512i, int);
// GF2P8AFFINEINVQB
__m128i _mm_gf2p8affineinv_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affineinv_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affineinv_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affineinv_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affineinv_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affineinv_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affineinv_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affineinv_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affineinv_epi64_epi8(__mmask64, __m512i, __m512i, int);
// GF2P8MULB
__m128i _mm_gf2p8mul_epi8(__m128i, __m128i);
__m128i _mm_mask_gf2p8mul_epi8(__m128i, __mmask16, __m128i, __m128i);
__m128i _mm_maskz_gf2p8mul_epi8(__mmask16, __m128i, __m128i);
__m256i _mm256_gf2p8mul_epi8(__m256i, __m256i);
__m256i _mm256_mask_gf2p8mul_epi8(__m256i, __mmask32, __m256i, __m256i);
__m256i _mm256_maskz_gf2p8mul_epi8(__mmask32, __m256i, __m256i);
__m512i _mm512_gf2p8mul_epi8(__m512i, __m512i);
__m512i _mm512_mask_gf2p8mul_epi8(__m512i, __mmask64, __m512i, __m512i);
__m512i _mm512_maskz_gf2p8mul_epi8(__mmask64, __m512i, __m512i);
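As a rough illustration of what these intrinsics compute for each byte, the following portable C sketch models the three operations in scalar code. It is a reference model under stated assumptions, not the hardware implementation: the function names (`gf2p8mul_byte`, `gf2p8affine_byte`, `gf2p8inv_byte`, `gf2p8affineinv_byte`, `parity8`) are hypothetical, and the behavior follows the Intel documentation as understood here, namely multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 and an 8x8 bit-matrix transform where result bit i uses matrix byte 7-i.

```c
#include <stdint.h>

/* Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1
 * (0x11B). Models one byte lane of GF2P8MULB. */
uint8_t gf2p8mul_byte(uint8_t a, uint8_t b)
{
    uint8_t result = 0;
    for (int i = 0; i < 8; i++) {
        if (b & 1u)
            result ^= a;              /* add a * x^i when bit i of b is set */
        uint8_t carry = a & 0x80u;
        a = (uint8_t)(a << 1);        /* multiply a by x */
        if (carry)
            a ^= 0x1Bu;               /* reduce modulo the polynomial */
        b >>= 1;
    }
    return result;
}

/* Parity (XOR of all bits) of a byte. */
uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return x & 1u;
}

/* Affine transform over GF(2): result bit i is the parity of
 * (matrix byte 7-i AND x), XORed with bit i of imm8.
 * Models one byte lane of GF2P8AFFINEQB. */
uint8_t gf2p8affine_byte(uint64_t matrix, uint8_t x, uint8_t imm8)
{
    uint8_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t row = (uint8_t)(matrix >> (8 * (7 - i)));
        uint8_t bit = parity8(row & x) ^ ((imm8 >> i) & 1u);
        result |= (uint8_t)(bit << i);
    }
    return result;
}

/* Multiplicative inverse in GF(2^8) by exhaustive search; 0 maps to 0. */
uint8_t gf2p8inv_byte(uint8_t x)
{
    for (int y = 1; y < 256; y++) {
        if (gf2p8mul_byte(x, (uint8_t)y) == 1u)
            return (uint8_t)y;
    }
    return 0;
}

/* Affine transform applied to the inverse of x.
 * Models one byte lane of GF2P8AFFINEINVQB. */
uint8_t gf2p8affineinv_byte(uint64_t matrix, uint8_t x, uint8_t imm8)
{
    return gf2p8affine_byte(matrix, gf2p8inv_byte(x), imm8);
}
```

With the matrix `0x0102040810204080` and a zero immediate, `gf2p8affine_byte` returns its input unchanged, which makes that constant a convenient identity for experimenting with the model. The exhaustive inverse search is chosen for clarity rather than speed.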