From WikiChip
Galois Field New Instructions (GFNI) - x86
x86
Instruction Set Architecture
Instruction Set Architecture
General
Variants
Topics
- Instructions
- Addressing Modes
- Registers
- Model-Specific Register
- Assembly
- Interrupts
- Micro-Ops
- Timer
- Calling Convention
- Microarchitectures
- CPUID
CPUIDs
Modes
Extensions(all)
Galois Field New Instructions (GFNI) is an x86 extension.
This article is still a stub and needs your attention. You can help improve this article by editing this page and adding the missing information. |
Contents
Overview
-
GF2P8AFFINEQB
- ...
-
GF2P8AFFINEINVQB
- ...
-
GF2P8MULB
- ...
Motivation
Detection
The GFNI feature flag indicates support for the SSE variant of these instructions operating on 128-bit vectors. 128- and 256-bit AVX versions are supported if the AVX flag is set as well.
The AVX-512 variants with EVEX encoding operating on 512-bit vectors are supported if the GFNI and AVX512F (Foundation) flags are set. The 128- and 256-bit versions with EVEX encoding are supported if the AVX512VL (Vector Length) flag is set as well.
CPUID | Instruction Set | |
---|---|---|
Input | Output | |
EAX=01H | ECX[bit 28] | AVX |
EAX=07H, ECX=0 | EBX[bit 16] | AVX512F |
EAX=07H, ECX=0 | EBX[bit 31] | AVX512VL |
EAX=07H, ECX=0 | ECX[bit 08] | GFNI |
Microarchitecture support
Designer | Microarchitecture | Year | Support Level | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GFNI SSE | GFNI AVX | GFNI AVX-512 (128/256) | GFNI AVX-512 (512) | ||||||||||||||||||
AMD | Zen 4 | 2022 | ✔ | ✔ | ✔ | ✔ |
Intrinsic functions
// GF2P8AFFINEQB
__m128i _mm_gf2p8affine_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affine_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affine_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affine_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affine_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affine_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affine_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affine_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affine_epi64_epi8(__mmask64, __m512i, __m512i, int);
// GF2P8AFFINEINVQB
__m128i _mm_gf2p8affineinv_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affineinv_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affineinv_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affineinv_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affineinv_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affineinv_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affineinv_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affineinv_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affineinv_epi64_epi8(__mmask64, __m512i, __m512i, int);
// VGF2P8MULB
__m128i _mm_gf2p8mul_epi8(__m128i, __m128i);
__m128i _mm_mask_gf2p8mul_epi8(__m128i, __mmask16, __m128i, __m128i);
__m128i _mm_maskz_gf2p8mul_epi8(__mmask16, __m128i, __m128i);
__m256i _mm256_gf2p8mul_epi8(__m256i, __m256i);
__m256i _mm256_mask_gf2p8mul_epi8(__m256i, __mmask32, __m256i, __m256i);
__m256i _mm256_maskz_gf2p8mul_epi8(__mmask32, __m256i, __m256i);
__m512i _mm512_gf2p8mul_epi8(__m512i, __m512i);
__m512i _mm512_mask_gf2p8mul_epi8(__m512i, __mmask64, __m512i, __m512i);
__m512i _mm512_maskz_gf2p8mul_epi8(__mmask64, __m512i, __m512i);