From WikiChip
Galois Field New Instructions (GFNI) - x86
Revision as of 14:41, 15 March 2023 by QuietRub

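Galois Field New Instructions (GFNI) is an x86 instruction set extension that adds three instructions for byte-wise arithmetic over the Galois field GF(2^8).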

This article is still a stub.


Overview

GF2P8AFFINEQB
...
GF2P8AFFINEINVQB
...
GF2P8MULB
...

Motivation

Detection

The GFNI feature flag on its own indicates support for the SSE variant of these instructions, which operates on 128-bit vectors. The AVX variants, operating on 128- and 256-bit vectors, are supported if the AVX flag is set as well.

The AVX-512 variants with EVEX encoding operating on 512-bit vectors are supported if the GFNI and AVX512F (Foundation) flags are set. The 128- and 256-bit versions with EVEX encoding are supported if the AVX512VL (Vector Length) flag is set as well.

CPUID input       CPUID output    Instruction set
EAX=01H           ECX[bit 28]     AVX
EAX=07H, ECX=0    EBX[bit 16]     AVX512F
EAX=07H, ECX=0    EBX[bit 31]     AVX512VL
EAX=07H, ECX=0    ECX[bit 08]     GFNI
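
The rules above can be sketched as a small decoding function. This is a minimal illustration, not a complete detection routine: the type `gfni_support` and the functions `bit_set` and `gfni_decode` are hypothetical names introduced here, the caller is assumed to have already executed CPUID and to pass in the raw register values, and the operating-system checks for AVX and AVX-512 state saving that a real program would also need are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: one field per support level described above. */
typedef struct {
    bool sse;         /* SSE variant, 128-bit vectors */
    bool avx;         /* AVX variants, 128- and 256-bit vectors */
    bool avx512_vl;   /* EVEX encoding, 128- and 256-bit vectors */
    bool avx512_512;  /* EVEX encoding, 512-bit vectors */
} gfni_support;

/* Returns true if the given bit of a CPUID output register is set. */
static bool bit_set(uint32_t reg, unsigned bit)
{
    return ((reg >> bit) & 1u) != 0;
}

/*
 * leaf1_ecx:            ECX returned by CPUID with EAX=01H
 * leaf7_ebx, leaf7_ecx: EBX and ECX returned by CPUID with EAX=07H, ECX=0
 */
static gfni_support gfni_decode(uint32_t leaf1_ecx,
                                uint32_t leaf7_ebx,
                                uint32_t leaf7_ecx)
{
    bool gfni     = bit_set(leaf7_ecx, 8);
    bool avx      = bit_set(leaf1_ecx, 28);
    bool avx512f  = bit_set(leaf7_ebx, 16);
    bool avx512vl = bit_set(leaf7_ebx, 31);

    gfni_support s;
    s.sse        = gfni;
    s.avx        = gfni && avx;
    s.avx512_512 = gfni && avx512f;
    s.avx512_vl  = gfni && avx512f && avx512vl;
    return s;
}
```

On GCC and Clang the register values could be obtained with `__get_cpuid_count` from `<cpuid.h>`; other compilers provide their own CPUID helpers.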

Microarchitecture support

Designer    Microarchitecture    Year    Support level
                                         GFNI SSE    GFNI AVX    GFNI AVX-512 (128/256)    GFNI AVX-512 (512)
AMD         Zen 4                2022

Intrinsic functions

// GF2P8AFFINEQB
__m128i _mm_gf2p8affine_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affine_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affine_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affine_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affine_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affine_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affine_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affine_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affine_epi64_epi8(__mmask64, __m512i, __m512i, int);
// GF2P8AFFINEINVQB
__m128i _mm_gf2p8affineinv_epi64_epi8(__m128i, __m128i, int);
__m128i _mm_mask_gf2p8affineinv_epi64_epi8(__m128i, __mmask16, __m128i, __m128i, int);
__m128i _mm_maskz_gf2p8affineinv_epi64_epi8(__mmask16, __m128i, __m128i, int);
__m256i _mm256_gf2p8affineinv_epi64_epi8(__m256i, __m256i, int);
__m256i _mm256_mask_gf2p8affineinv_epi64_epi8(__m256i, __mmask32, __m256i, __m256i, int);
__m256i _mm256_maskz_gf2p8affineinv_epi64_epi8(__mmask32, __m256i, __m256i, int);
__m512i _mm512_gf2p8affineinv_epi64_epi8(__m512i, __m512i, int);
__m512i _mm512_mask_gf2p8affineinv_epi64_epi8(__m512i, __mmask64, __m512i, __m512i, int);
__m512i _mm512_maskz_gf2p8affineinv_epi64_epi8(__mmask64, __m512i, __m512i, int);
// GF2P8MULB
__m128i _mm_gf2p8mul_epi8(__m128i, __m128i);
__m128i _mm_mask_gf2p8mul_epi8(__m128i, __mmask16, __m128i, __m128i);
__m128i _mm_maskz_gf2p8mul_epi8(__mmask16, __m128i, __m128i);
__m256i _mm256_gf2p8mul_epi8(__m256i, __m256i);
__m256i _mm256_mask_gf2p8mul_epi8(__m256i, __mmask32, __m256i, __m256i);
__m256i _mm256_maskz_gf2p8mul_epi8(__mmask32, __m256i, __m256i);
__m512i _mm512_gf2p8mul_epi8(__m512i, __m512i);
__m512i _mm512_mask_gf2p8mul_epi8(__m512i, __mmask64, __m512i, __m512i);
__m512i _mm512_maskz_gf2p8mul_epi8(__mmask64, __m512i, __m512i);
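
For checking results on a machine without GFNI, the per-byte operation behind the `gf2p8mul_epi8` intrinsics can be modelled in scalar code. The sketch below assumes the reduction polynomial x^8 + x^4 + x^3 + x + 1 given in Intel's documentation for GF2P8MULB; the function name `gf2p8mul_byte_ref` is hypothetical and is not part of any intrinsic header.

```c
#include <stdint.h>

/*
 * Scalar reference model (hypothetical helper): multiplies two bytes as
 * elements of GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1.
 * The vector intrinsics apply this operation to each pair of bytes.
 */
static uint8_t gf2p8mul_byte_ref(uint8_t a, uint8_t b)
{
    uint8_t product = 0;

    for (int i = 0; i < 8; i++) {
        if (b & 1)
            product ^= a;            /* add a * x^i if bit i of b is set */

        uint8_t carry = a & 0x80;    /* would a * x overflow 8 bits? */
        a = (uint8_t)(a << 1);       /* multiply a by x */
        if (carry)
            a ^= 0x1B;               /* reduce: x^8 = x^4 + x^3 + x + 1 */

        b >>= 1;
    }
    return product;
}
```

Applying the function to each of the 16, 32, or 64 byte pairs of two vectors gives a reference against which the output of the unmasked intrinsics can be compared.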

Bibliography