Difference between revisions of "x86/persistent memory extensions"

Latest revision as of 20:08, 13 May 2021

x86
Instruction Set Architecture

History
Families

AMD's CPUIDs
Intel's CPUIDs

Persistent memory extensions (PMEM) are a set of x86 instructions designed to improve the usability of working with storage-class memory.

Overview[edit]

Intel adopted the SNIA NVM Programming Model for working with persistent memory. This model allows for direct access (DAX) using byte-addressable operations (i.e., load/store), however, the persistence of the data in the cache is not guaranteed until it has entered the persistence domain. x86 provides a set of instructions for flushing cache lines in a more optimized way. In addition to existing x86 instructions such as non-temporal stores, CLFLUSH, and WBINVD (kernel only), two new instructions were added:

Instruction	Description
`CLFLUSHOPT`	Optimized CLFLUSH; Behaves similarly to `CLFLUSH` but without the serialization, thereby optimized for performance by allowing for some concurrency when executing multiple CLFLUSHOPT instructions back-to-back.
`CLWB`	Cache line write back; behaves similarly to `CLFLUSHOPT` but keeps the cache line valid (i.e., the cache line is flushed and then marked as no longer dirty) thereby optimized for performance by keeping the line in the cache, increasing the chance of a cache hit.

Both of the new instructions must follow by a SFENCE to ensure all flushes are completed before continuing.

Detection[edit]

CPUID		Instruction Set
Input	Output	Instruction Set
EAX=07H, ECX=0	EBX[bit 23]	CLFLUSHOPT
EAX=07H, ECX=0	EBX[bit 24]	CLWB

Microarchitecture support[edit]

Instruction	Introduction
Instruction	Intel	AMD
`CLFLUSHOPT`	Skylake (server) Skylake (client) Goldmont	Zen
`CLWB`	Skylake (server) Ice Lake (client)	Zen 2

Intrinsic functions[edit]

#include <immintrin.h>

# clflushopt
void _mm_clflushopt (void const * p)

# clwb
void _mm_clwb (void const * p)

@@ Line 11: / Line 11: @@
 | <code>CLFLUSHOPT</code> || Optimized CLFLUSH; Behaves similarly to <code>CLFLUSH</code> but without the serialization, thereby optimized for performance by allowing for some concurrency when executing multiple CLFLUSHOPT instructions back-to-back.
 |-
-| <code>CLWB</code> || Cache line write back; behaves similarly to <code>CLFLUSHOPT</code> but keeps the cache line valid (i.e., the cache line is flushed and then marked as no longer dirty) thereby optimized for performance by keeping the line in the cache, increasing the cache of a [[cache hit]].
+| <code>CLWB</code> || Cache line write back; behaves similarly to <code>CLFLUSHOPT</code> but keeps the cache line valid (i.e., the cache line is flushed and then marked as no longer dirty) thereby optimized for performance by keeping the line in the cache, increasing the chance of a [[cache hit]].
 |}

WikiChip

The Fuse Coverage

Social Media

Companies

Microarchitectures

Technology Nodes

Intel

AMD

ARM

Cavium

Samsung

Intel

AMD

Ampere

Apple

Cavium

HiSilicon

MediaTek

NXP

Qualcomm

Renesas