From WikiChip
Difference between revisions of "intel/frequency behavior"

(Base, Non-AVX Turbo, and AVX Turbo)
 
(17 intermediate revisions by 6 users not shown)
Line 7: Line 7:
 
Intel has implemented a number of mechanisms into their architectures to extract additional performance through higher frequency whenever the power and thermal budgets allow.
 
Intel has implemented a number of mechanisms into their architectures to extract additional performance through higher frequency whenever the power and thermal budgets allow.
  
 +
* {{intel|Intelligent Power Capability}}
 +
* {{intel|Enhanced Intel SpeedStep Technology}} (EIST) - Introduced with {{intel|Pentium M|l=arch}}, 2005
 
* {{intel|Dynamic Acceleration Technology}} (DAT) - Introduced with {{intel|Modified Pentium M|l=arch}}/{{intel|Core|l=arch}} 2006
 
* {{intel|Dynamic Acceleration Technology}} (DAT) - Introduced with {{intel|Modified Pentium M|l=arch}}/{{intel|Core|l=arch}} 2006
 
* {{intel|Turbo Boost Technology}} (TBT) - Introduced with {{intel|Nehalem|l=arch}} in 2008
 
* {{intel|Turbo Boost Technology}} (TBT) - Introduced with {{intel|Nehalem|l=arch}} in 2008
** Turbo Boost Technology 2.0 (TBT 2.0) - Introduced with {{intel|Sandy Bridge|l=arch}} in 2011
+
** Turbo Boost Technology 2.0 (TBT 2.0) - Introduced with {{intel|Sandy Bridge|l=arch}} in 2010
 
* {{intel|Speed Shift Technology}} (SST) - Introduced with {{intel|Skylake|l=arch}} in 2015
 
* {{intel|Speed Shift Technology}} (SST) - Introduced with {{intel|Skylake|l=arch}} in 2015
* {{intel|Turbo Boost Max Technology}} (TBMT) - Introduced with {{intel|Broadwell E|l=core}} in 2016
+
* {{intel|Turbo Boost Max Technology}} 3.0 (TBMT) - Introduced with {{intel|Broadwell E|l=core}} in 2016
 +
* {{intel|Thermal Velocity Boost}} (TVB) - Introduced with {{intel|Coffee Lake H|l=core}} in 2018
 +
* {{intel|Speed Select Technology}} (SST) - Introduced with {{intel|Cascade Lake|l=arch}} in 2019
 +
 
 +
== Base, LFM, HFM ==
 +
{| class="wikitable" style="float: left; margin: 10px;"
 +
|-
 +
! colspan="2" | Example [[P-State]] Table
 +
|-
 +
! Voltage !! Frequency
 +
|-
 +
| 1.21 V || 2.8 GHz (HFM)
 +
|-
 +
| 1.18 V || 2.4 GHz
 +
|-
 +
| 1.05 V || 2.0 GHz
 +
|-
 +
| 0.96 V || 1.6 GHz
 +
|-
 +
| 0.93 V || 1.3 GHz
 +
|-
 +
| 0.86 V || 900 MHz
 +
|-
 +
| 0.80 V || 600 MHz (LFM)
 +
|}
 +
With {{intel|EIST}}, which is found on pretty much every Intel's processor since the mid-2000, each processor comes with a series of frequencies and associated voltages (note that a tuple containing the voltage and frequency is called a [[P-State]]). An example frequency table is shown on the left. This table is stored within the read-only processor {{x86|model specific register}} (MSR) and is used to ensure that frequencies do not exceed the lower or upper bound. The lower bound is called the '''Low Frequency Mode''' ('''LFM''') and is the lowest frequency-voltage operating point for a given processor. The upper bound is called the '''High Frequency mode''' ('''HFM''') and is the highest frequency-voltage operating point. Note that the HFM frequency is usually referred to by its advertised name: '''Base Frequency'''.
 +
 
 +
Most of the time, the processor does very little work. In order to save power, the processor will drop into a lower [[P-State]] when not under any demanding workloads. The processor will switch around between the various P-States as needed and as dictated by the [[operating system]].
 +
 
 +
{{clear}}
  
 
== Base, Non-AVX Turbo, and AVX Turbo ==
 
== Base, Non-AVX Turbo, and AVX Turbo ==
[[File:mixed avx-normal workloads with avx512.png|right|400px]]
+
[[File:mixed avx-normal workloads with avx512.png|thumb|right|200px|Cores are grouped based on the workload characteristic being executed.]]
Because different workloads exhibit different [[die]] thermos and electrical characteristics, they also have different frequencies. Intel organizes workloads into three categories:
+
Because different workloads exhibit different [[die]] thermal and electrical characteristics, they also have different frequencies. Intel organizes workloads into three categories:
  
 
* '''Non-AVX''' - workloads such as SSE and simple (e.g., add/bit) integer vector operations and all other regular instructions.
 
* '''Non-AVX''' - workloads such as SSE and simple (e.g., add/bit) integer vector operations and all other regular instructions.
Line 27: Line 58:
 
{| class="wikitable"
 
{| class="wikitable"
 
|-
 
|-
! Mode !! Example Workload !! Absolute Guaranteed<br>Lowest Frequency !! Absolute Guaranteed<br>Highest Frequency
+
! Mode !! Example Workload !! Absolute Guaranteed<br>Lowest Frequency !! Absolute<br>Highest Frequency
 
|-
 
|-
 
| Non-AVX || SSE, light AVX2 Integer Vector (non-MUL), All regular instruction || Base Frequency || Turbo Frequency
 
| Non-AVX || SSE, light AVX2 Integer Vector (non-MUL), All regular instruction || Base Frequency || Turbo Frequency
Line 36: Line 67:
 
|}
 
|}
  
== Historical behavior ==
+
=== Historical behavior ===
Prior to {{intel|Haswell|l=arch}}, {{x86|AVX2}} workload on one core meant all cores were capped at ''AVX2 Turbo'' frequency. This had the undesirable effect of reducing performance for non-AVX workloads on cores that were unrelated to the cores executing AVX2 workloads. This behavior was changed with {{intel|Broadwell|l=arch}} which grouped cores executing AVX2 workloads together and cores executing non-AVX workloads separately, allowing the former cores group to execute at the lower AVX2 turbo frequency while having the later cores group execute at full non-AVX2 turbo.
+
In {{intel|Haswell|l=arch}}, an {{x86|AVX2}} workload on one core meant all cores were capped at ''AVX2 Turbo'' frequency. This had the undesirable effect of reducing performance for non-AVX workloads on cores that were unrelated to the cores executing AVX2 workloads. This behavior was changed with {{intel|Broadwell|l=arch}} which grouped cores executing AVX2 workloads together and cores executing non-AVX workloads separately, allowing the former cores group to execute at the lower AVX2 turbo frequency while having the later cores group execute at full non-AVX2 turbo.
  
 
::[[File:broadwell avx turbo changes.png|700px]]
 
::[[File:broadwell avx turbo changes.png|700px]]
 +
 +
== See also ==
 +
* AMD's {{amd|Frequency Behavior}}
 +
 +
[[Category:power management mechanisms by intel]]

Latest revision as of 00:10, 1 June 2020

The frequency behavior of Intel's CPUs is complex and is governed by multiple mechanisms that perform dynamic frequency scaling based on the available power and thermal headroom.

Overview

As transistor budgets increase, new features are added and cores grow in capability. Unfortunately, power constraints have remained the same and in many situations have become more restrictive. The result is that, despite exponentially increasing density, the area of dark silicon is growing just as fast.

Intel has implemented a number of mechanisms into their architectures to extract additional performance through higher frequency whenever the power and thermal budgets allow.

Base, LFM, HFM

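{| class="wikitable"
|-
! colspan="2" | Example P-State Table
|-
! Voltage !! Frequency
|-
| 1.21 V || 2.8 GHz (HFM)
|-
| 1.18 V || 2.4 GHz
|-
| 1.05 V || 2.0 GHz
|-
| 0.96 V || 1.6 GHz
|-
| 0.93 V || 1.3 GHz
|-
| 0.86 V || 900 MHz
|-
| 0.80 V || 600 MHz (LFM)
|}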

With EIST, which is found on nearly every Intel processor since the mid-2000s, each processor comes with a series of frequencies and associated voltages (a tuple containing a voltage and a frequency is called a P-State). An example P-State table is shown above. This table is stored within a read-only processor model specific register (MSR) and is used to ensure that frequencies stay within the lower and upper bounds. The lower bound is called the Low Frequency Mode (LFM) and is the lowest frequency-voltage operating point for a given processor. The upper bound is called the High Frequency Mode (HFM) and is the highest frequency-voltage operating point. Note that the HFM frequency is usually referred to by its advertised name: Base Frequency.
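
The bounds described above can be sketched in a few lines of Python. This is only an illustration built from the example table in this section; the `PState` type, the `select_p_state` function, and its "slowest state that still meets the request" policy are assumptions made for the sketch, not Intel's actual selection logic.

```python
from typing import NamedTuple


class PState(NamedTuple):
    voltage_v: float
    freq_mhz: int


# Example P-State table from this section (values are illustrative only).
P_STATES = [
    PState(1.21, 2800),  # HFM, advertised as the base frequency
    PState(1.18, 2400),
    PState(1.05, 2000),
    PState(0.96, 1600),
    PState(0.93, 1300),
    PState(0.86, 900),
    PState(0.80, 600),   # LFM
]

HFM = max(P_STATES, key=lambda p: p.freq_mhz)
LFM = min(P_STATES, key=lambda p: p.freq_mhz)


def select_p_state(requested_mhz: int) -> PState:
    """Hypothetical policy: clamp the request to [LFM, HFM], then pick the
    slowest P-State whose frequency still satisfies it."""
    clamped = min(max(requested_mhz, LFM.freq_mhz), HFM.freq_mhz)
    candidates = [p for p in P_STATES if p.freq_mhz >= clamped]
    return min(candidates, key=lambda p: p.freq_mhz)
```

A request above HFM or below LFM is pulled back to the nearest bound, which is the role the table plays in the paragraph above.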

Most of the time, the processor does very little work. To save power, it drops into a lower P-State when it is not under a demanding workload. The processor switches between the various P-States as needed and as directed by the operating system.
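
To get a feel for why a lower P-State saves power, the sketch below applies the common first-order CMOS approximation that dynamic power scales with V² · f. That approximation is not taken from this article; it assumes the switched capacitance is the same in every state and ignores leakage, so the numbers are rough estimates for the example table only.

```python
# (voltage in volts, frequency in MHz) pairs from the example P-State table.
P_STATES = [
    (1.21, 2800),  # HFM
    (1.18, 2400),
    (1.05, 2000),
    (0.96, 1600),
    (0.93, 1300),
    (0.86, 900),
    (0.80, 600),   # LFM
]


def relative_dynamic_power(voltage_v, freq_mhz, ref=P_STATES[0]):
    """Dynamic power relative to the reference P-State, using P ~ C * V^2 * f.

    Assumes identical switched capacitance C in both states and no leakage.
    """
    ref_v, ref_f = ref
    return (voltage_v / ref_v) ** 2 * (freq_mhz / ref_f)


for v, f in P_STATES:
    share = relative_dynamic_power(v, f)
    print(f"{f:>5} MHz @ {v:.2f} V -> {share:.1%} of HFM dynamic power")
```

Under this simplified model, the LFM point in the example table draws less than a tenth of the dynamic power of the HFM point, because both the voltage and the frequency terms shrink.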

Base, Non-AVX Turbo, and AVX Turbo

Figure: Cores are grouped based on the characteristics of the workload being executed.

Because different workloads exhibit different die thermal and electrical characteristics, they also run at different frequencies. Intel organizes workloads into three categories:

  • Non-AVX - workloads such as SSE, simple (e.g., add/bit) integer vector operations, and all other regular instructions.
  • AVX2 Heavy - workloads that make heavy use of complex AVX2 operations (e.g., floating point and integer vector multiplications). This category also includes light AVX-512 operations such as bit scanning and other simple (i.e., non-INT/FP MUL) operations.
  • AVX-512 Heavy - workloads that make use of complex AVX-512 operations, such as floating point and integer vector multiplications.

The frequency of each core is determined independently, based on which of the workload categories above it is running. That is, cores running Non-AVX workloads can enjoy the full regular turbo frequency, whereas cores executing AVX2 or AVX-512 workloads operate at their own designated turbo frequencies.

As a result, each processor has the following frequency ranges:

{| class="wikitable"
|-
! Mode !! Example Workload !! Absolute Guaranteed<br>Lowest Frequency !! Absolute<br>Highest Frequency
|-
| Non-AVX || SSE, light AVX2 Integer Vector (non-MUL), all regular instructions || Base Frequency || Turbo Frequency
|-
| AVX2 Heavy || All AVX2 operations, light AVX-512 (non-FP, Int Vect non-MUL) || AVX2 Base || AVX2 Turbo
|-
| AVX-512 Heavy || All heavy AVX-512 operations || AVX-512 Base || AVX-512 Turbo
|}
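
The per-core behavior can be modeled as a simple lookup from workload category to frequency range. In the sketch below, the `Workload` enum, the `core_frequency_ranges` function, and every MHz value are hypothetical: real base and turbo frequencies are specific to each processor model and are not given in this article.

```python
from enum import Enum


class Workload(Enum):
    NON_AVX = "Non-AVX"
    AVX2_HEAVY = "AVX2 Heavy"
    AVX512_HEAVY = "AVX-512 Heavy"


# Hypothetical (lowest guaranteed, highest) frequencies in MHz for an
# imaginary part. These numbers are placeholders, not real specifications.
FREQ_RANGE_MHZ = {
    Workload.NON_AVX: (2800, 3800),       # Base / Turbo
    Workload.AVX2_HEAVY: (2400, 3400),    # AVX2 Base / AVX2 Turbo
    Workload.AVX512_HEAVY: (2000, 3000),  # AVX-512 Base / AVX-512 Turbo
}


def core_frequency_ranges(core_workloads):
    """Return the (lowest, highest) range for each core independently,
    looking only at the workload category that core is running."""
    return [FREQ_RANGE_MHZ[w] for w in core_workloads]


cores = [Workload.NON_AVX, Workload.AVX512_HEAVY, Workload.AVX2_HEAVY]
for core_id, (low, high) in enumerate(core_frequency_ranges(cores)):
    print(f"core {core_id}: {cores[core_id].value}: {low}-{high} MHz")
```

Because each entry is computed from that core's own category, a core running AVX-512 code does not change the range reported for a neighboring Non-AVX core.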

Historical behavior

In Haswell, an AVX2 workload on one core meant that all cores were capped at the AVX2 Turbo frequency. This had the undesirable effect of reducing performance for non-AVX workloads on cores that were unrelated to the cores executing AVX2 workloads. This behavior was changed with Broadwell, which groups cores executing AVX2 workloads separately from cores executing non-AVX workloads, allowing the former group to run at the lower AVX2 turbo frequency while the latter group runs at the full non-AVX turbo frequency.

[Figure: broadwell avx turbo changes.png]
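
The difference between the two behaviors described above can be sketched as a single function with a switch. The function name `turbo_caps`, the `per_core_grouping` flag, and both frequency constants are hypothetical and serve only to contrast the policies.

```python
NON_AVX_TURBO_MHZ = 3600  # hypothetical value
AVX2_TURBO_MHZ = 3300     # hypothetical value


def turbo_caps(core_runs_avx2, per_core_grouping):
    """Return the turbo cap (MHz) for each core.

    per_core_grouping=False models the Haswell behavior described above:
    one AVX2 core caps every core at the AVX2 turbo frequency.
    per_core_grouping=True models the Broadwell behavior: only the cores
    actually running AVX2 are capped.
    """
    if per_core_grouping:
        return [
            AVX2_TURBO_MHZ if runs_avx2 else NON_AVX_TURBO_MHZ
            for runs_avx2 in core_runs_avx2
        ]
    cap = AVX2_TURBO_MHZ if any(core_runs_avx2) else NON_AVX_TURBO_MHZ
    return [cap] * len(core_runs_avx2)


workload = [True, False, False, False]  # only core 0 runs AVX2
print("Haswell-style:  ", turbo_caps(workload, per_core_grouping=False))
print("Broadwell-style:", turbo_caps(workload, per_core_grouping=True))
```

With one AVX2 core out of four, the first call caps all four cores at the AVX2 turbo value, while the second leaves the three non-AVX cores at the full turbo value.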

See also

  • AMD's Frequency Behavior