From WikiChip
Editing flops
Latest revision | Your text | ||
Line 1: | Line 1: | ||
{{title|Floating-Point Operations Per Second (FLOPS)}} | {{title|Floating-Point Operations Per Second (FLOPS)}} | ||
− | '''Floating-point operations per second''' ('''FLOPS''') is a | + | '''Floating-point operations per second''' ('''FLOPS''') is a microprocessor performance unit used to quantify the number of [[floating-point]] [[floating-point operations|operations]] a [[physical core|core]], machine, or system is capable of performing in one second. |
== Overview == | == Overview == | ||
Line 22: | Line 22: | ||
:<math>\text{FLOPS}_\text{system} = \frac{\text{instructions}}{\text{cycle}} \times \frac{\text{operations}}{\text{instruction}} \times \frac{\text{FLOPs}}{\text{operation}} \times \frac{\text{cycles}}{\text{second}} \times \frac{\text{cores}}{\text{node}} \times \frac{\text{nodes}}{\text{system}}</math> | :<math>\text{FLOPS}_\text{system} = \frac{\text{instructions}}{\text{cycle}} \times \frac{\text{operations}}{\text{instruction}} \times \frac{\text{FLOPs}}{\text{operation}} \times \frac{\text{cycles}}{\text{second}} \times \frac{\text{cores}}{\text{node}} \times \frac{\text{nodes}}{\text{system}}</math> | ||
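The product above can be checked numerically. The following is a minimal sketch in Python, not part of the original article: the function names, the 3 GHz clock, and the 4-core single-node configuration are assumptions chosen purely for illustration. Only the per-cycle figure (2 × 256-bit FMA, double precision) is taken from the table below.

```python
# Illustrative sketch of the peak-FLOPS product; all names here are hypothetical.

def flops_per_cycle(instructions_per_cycle, operations_per_instruction,
                    flops_per_operation):
    """Peak FLOPs a single core can complete per clock cycle."""
    return (instructions_per_cycle
            * operations_per_instruction
            * flops_per_operation)


def peak_flops(instructions_per_cycle, operations_per_instruction,
               flops_per_operation, cycles_per_second,
               cores_per_node, nodes_per_system):
    """Theoretical peak FLOPS of a system, multiplied term by term as in the formula."""
    per_cycle = flops_per_cycle(instructions_per_cycle,
                                operations_per_instruction,
                                flops_per_operation)
    return per_cycle * cycles_per_second * cores_per_node * nodes_per_system


# Double precision on a core with 2 x 256-bit FMA units:
#   2 FMA instructions/cycle, 256 // 64 = 4 operations/instruction,
#   2 FLOPs per FMA operation (one multiply plus one add).
dp_per_cycle = flops_per_cycle(2, 256 // 64, 2)

# Assumed configuration (not from the article): 3 GHz clock, 4 cores, 1 node.
system_peak = peak_flops(2, 256 // 64, 2, 3_000_000_000, 4, 1)

print(dp_per_cycle, "FLOPs/cycle per core")
print(system_peak / 1e9, "GFLOPS peak")
```

The per-core result agrees with the 16 FLOPs/cycle double-precision entry listed for 2 × 256-bit FMA designs. The system figure is a theoretical ceiling under the assumed clock and core count, and sustained throughput on real workloads is typically lower.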
== FLOPs by microarchitecture == | == FLOPs by microarchitecture == | ||
Line 53: | Line 43: | ||
| '''SP''' || 16 FLOPs/cycle || 8 FLOPs + 8 FLOPs | | '''SP''' || 16 FLOPs/cycle || 8 FLOPs + 8 FLOPs | ||
|- | |- | ||
− | | rowspan="3" | {{intel|Haswell|l=arch}}<br>{{intel|Broadwell|l=arch}}<br>{{intel|Skylake|l=arch}}<br>{{intel|Kaby Lake|l=arch}}<br>{{intel| | + | | rowspan="3" | {{intel|Haswell|l=arch}}<br>{{intel|Broadwell|l=arch}}<br>{{intel|Skylake|l=arch}}<br>{{intel|Kaby Lake|l=arch}}<br>{{intel|Coffee Lake|l=arch}}<br>{{intel|Whiskey Lake|l=arch}}<br>{{intel|Amber Lake|l=arch}} || '''EUs''' || colspan="2" | 2 × 256-bit FMA || rowspan="3" | {{x86|AVX2}} & FMA (256-bit) |
|- | |- | ||
| '''DP''' || 16 FLOPs/cycle || 2 × 8 FLOPs | | '''DP''' || 16 FLOPs/cycle || 2 × 8 FLOPs | ||
Line 60: | Line 50: | ||
|- | |- | ||
| rowspan="3" | {{intel|Skylake (server)|l=arch}} || '''EUs''' || colspan="2" | 2 × 512-bit FMA (varies by SKU) || rowspan="3" | {{x86|AVX-512}} & FMA (512-bit) | | rowspan="3" | {{intel|Skylake (server)|l=arch}} || '''EUs''' || colspan="2" | 2 × 512-bit FMA (varies by SKU) || rowspan="3" | {{x86|AVX-512}} & FMA (512-bit) | ||
|- | |- | ||
| '''DP''' || 32 FLOPs/cycle || 2 × 16 FLOPs | | '''DP''' || 32 FLOPs/cycle || 2 × 16 FLOPs | ||
Line 73: | Line 57: | ||
! colspan="5" | [[Intel]] {{intel|MIC}} Microarchitectures | ! colspan="5" | [[Intel]] {{intel|MIC}} Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{intel|Knights Landing|l=arch}} || '''EUs''' || colspan="2" | 2 × 512-bit FMA || rowspan="3" | {{x86|AVX-512}} & FMA (512-bit) | + | | rowspan="3" | {{intel|Knights Landing|l=arch}} || '''EUs''' || colspan="2" | 2 × 512-bit FMA (varies by SKU) || rowspan="3" | {{x86|AVX-512}} & FMA (512-bit) |
|- | |- | ||
| '''DP''' || 32 FLOPs/cycle || 2 × 16 FLOPs | | '''DP''' || 32 FLOPs/cycle || 2 × 16 FLOPs | ||
Line 93: | Line 77: | ||
| '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | | '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | ||
|- | |- | ||
− | | rowspan="3" | {{amd|Zen|l=arch}}<br>{{amd|Zen+|l=arch}} || '''EUs''' || colspan="2" | 2 × 128-bit FMA || rowspan="3" | {{x86|AVX2}} & FMA ( | + | | rowspan="3" | {{amd|Zen|l=arch}}<br>{{amd|Zen+|l=arch}} || '''EUs''' || colspan="2" | 2 × 128-bit FMA || rowspan="3" | {{x86|AVX2}} & FMA (128-bit) |
|- | |- | ||
| '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | | '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | ||
Line 99: | Line 83: | ||
| '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | | '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | ||
|- | |- | ||
− | | rowspan="3" | {{amd|Zen 2 | + | | rowspan="3" | {{amd|Zen 2|l=arch}} || '''EUs''' || colspan="2" | 2 × 256-bit FMA || rowspan="3" | {{x86|AVX2}} & FMA (256-bit) |
|- | |- | ||
| '''DP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | | '''DP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | ||
Line 121: | Line 97: | ||
! colspan="5" | [[ARM]] Microarchitectures | ! colspan="5" | [[ARM]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{armh|Cortex-A57|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{armh|Cortex-A57|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 4 FLOPs/cycle || 4 FLOPs | | '''DP''' || 4 FLOPs/cycle || 4 FLOPs | ||
Line 127: | Line 103: | ||
| '''SP''' || 8 FLOPs/cycle || 8 FLOPs | | '''SP''' || 8 FLOPs/cycle || 8 FLOPs | ||
|- | |- | ||
− | | rowspan="3" | {{armh|Cortex-A76 | + | | rowspan="3" | {{armh|Cortex-A76|l=arch}} || '''EUs''' || colspan="2" | 2 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | | '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | ||
|- | |- | ||
| '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | | '''SP''' || 16 FLOPs/cycle || 2 x 8 FLOPs | ||
|- | |- | ||
! colspan="5" | [[AppliedMicro]]/[[Ampere Computing]] Microarchitectures | ! colspan="5" | [[AppliedMicro]]/[[Ampere Computing]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{apm|Storm|l=arch}}<br>{{apm|Shadowcat|l=arch}}<br>{{apm|Skylark|l=arch}} || '''EUs''' || colspan="2" | 1 × 64-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{apm|Storm|l=arch}}<br>{{apm|Shadowcat|l=arch}}<br>{{apm|Skylark|l=arch}} || '''EUs''' || colspan="2" | 1 × 64-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 2 FLOPs/cycle || 2 FLOPs | | '''DP''' || 2 FLOPs/cycle || 2 FLOPs | ||
Line 161: | Line 119: | ||
! colspan="5" | [[Cavium]] Microarchitectures | ! colspan="5" | [[Cavium]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{cavium|Vulcan|l=arch}} || '''EUs''' || colspan="2" | 2 × 128-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{cavium|Vulcan|l=arch}} || '''EUs''' || colspan="2" | 2 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | | '''DP''' || 8 FLOPs/cycle || 2 x 4 FLOPs | ||
Line 169: | Line 127: | ||
! colspan="5" | [[Samsung]] Microarchitectures | ! colspan="5" | [[Samsung]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{samsung|M1|l=arch}}<br>{{samsung|M2|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA + 1 × 128-bit Addition || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{samsung|M1|l=arch}}<br>{{samsung|M2|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA + 1 × 128-bit Addition || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 6 FLOPs/cycle || 1 x 4 FLOPs + 1 x 2 FLOPs | | '''DP''' || 6 FLOPs/cycle || 1 x 4 FLOPs + 1 x 2 FLOPs | ||
Line 175: | Line 133: | ||
| '''SP''' || 12 FLOPs/cycle || 1 x 8 FLOPs + 1 x 4 FLOPs | | '''SP''' || 12 FLOPs/cycle || 1 x 8 FLOPs + 1 x 4 FLOPs | ||
|- | |- | ||
− | | rowspan="3" | {{samsung|M3|l=arch}} || '''EUs''' || colspan="2" | 3 × 128-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{samsung|M3|l=arch}} || '''EUs''' || colspan="2" | 3 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 12 FLOPs/cycle || 3 x 4 FLOPs | | '''DP''' || 12 FLOPs/cycle || 3 x 4 FLOPs | ||
Line 183: | Line 141: | ||
! colspan="5" | [[Phytium]] Microarchitectures | ! colspan="5" | [[Phytium]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{phytium|Xiaomi|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{phytium|Xiaomi|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 4 FLOPs/cycle || 1 x 4 FLOPs | | '''DP''' || 4 FLOPs/cycle || 1 x 4 FLOPs | ||
Line 191: | Line 149: | ||
! colspan="5" | [[HiSilicon]] Microarchitectures | ! colspan="5" | [[HiSilicon]] Microarchitectures | ||
|- | |- | ||
− | | rowspan="3" | {{hisilicon|TaiShan v110|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8 | + | | rowspan="3" | {{hisilicon|TaiShan v110|l=arch}} || '''EUs''' || colspan="2" | 1 × 128-bit FMA || rowspan="3" | {{arm|ARMv8}} (128-bit) |
|- | |- | ||
| '''DP''' || 4 FLOPs/cycle || 1 x 4 FLOPs | | '''DP''' || 4 FLOPs/cycle || 1 x 4 FLOPs |